Abstract

Fractal geometry arises in a truly extraordinary range of natural and man-made phenomena.
The 1/f family of fractal random processes, in particular, are appealing candidates
for data modeling in a wide variety of signal processing scenarios involving such phenomena.
In contrast to the well-studied family of ARMA processes, 1/f processes are typically
characterized by persistent long-term correlation structure. However, the mathematical
intractability of such processes has largely precluded their use in signal processing. We
introduce and develop a powerful Karhunen-Loève-like representation for 1/f processes in terms
of orthonormal wavelet bases that considerably simplifies their analysis. Wavelet-based
representations yield highly convenient synthesis and whitening filters for 1/f processes, and
allow a number of fundamental detection and estimation problems involving 1/f processes
to be readily solved. In particular, we obtain robust and computationally efficient
algorithms for parameter and signal estimation with 1/f signals in noisy backgrounds, coherent
detection in 1/f backgrounds, and optimal discrimination between 1/f signals. Results
from a variety of simulations are presented to demonstrate the viability of the algorithms.

In contrast to the statistically self-similar 1/f processes, homogeneous signals are
governed by deterministic self-similarity. Orthonormal wavelet bases play an equally important
role in the representation of these signals, and, in fact, are used to construct orthonormal
“self-similar” bases. The spectral and fractal characteristics of homogeneous signals make
them appealing candidates for use in a number of applications. As one potential example,
we consider the use of homogeneous signal sets in a communications-based context. In
particular, we develop a strategy for embedding information into a homogeneous waveform
on all time-scales. The result is a unique multirate modulation strategy that is well-suited
for use with noisy channels of simultaneously unknown duration and bandwidth.
Computationally efficient modulators and demodulators are developed for the scheme, and the
results of a preliminary performance evaluation are presented. Although not yet a fully
developed protocol, “fractal modulation” represents a novel and compelling paradigm for
communication.
Chapter 1


Introduction 


There is a wide range of contexts in which there is a need to be able to synthesize,
analyze and process fractal signals. Indeed, fractal geometry abounds in nature.
Fractal structure can be found, for example, in natural landscapes, in the distribution
of earthquakes, in ocean waves, in turbulent flow, in the pattern of errors on
communication channels, in the bronchi of the human lung, and even in fluctuations
of the stock market. In many applications, we are interested in modeling the inherent
fractal behavior in order that we might perform some form of signal processing.
For example, there are many problems of detection, classification, smoothing, and
prediction that involve fractal signals. Likewise, the fact that fractal behavior is so prevalent
suggests that fractal geometry is somehow optimal or efficient. Consequently, there is
increasing interest in the design of communication, telemetry, and other engineering
systems based on the use of fractal signals.

This thesis is about the development and exploitation of a framework for representing
and characterizing fractal signals. But what is a fractal signal? Most generally,
a fractal signal is a function possessing structure at every scale of detail. However,
the fractals of most interest, and those to which we restrict our attention in this
thesis, are those for which the detail at each scale is similar. In this case we say that
the fractal is self-similar or, alternatively, scale-invariant, reflecting the fact that the
signal has no absolute scale of reference. Fractal signals may be classified into one
of two broad categories: those in which the self-similarity is statistical, and those in
which it is deterministic. For statistically self-similar signals, the detail at all scales
has the same statistics, while for deterministically self-similar signals the detail at
all scales is identical.

Various representations for self-similar signals can be found in the literature; however,
none have been particularly suitable for engineering applications, either for reasons
of mathematical intractability or computational complexity. In this thesis, we
introduce and develop highly efficient representations for some important classes of
fractal signals based on the use of orthonormal wavelet bases. Orthonormal wavelet
bases, having the property that all basis functions are dilations and translations of
some prototype function, are in many respects ideally suited for use with self-similar
signals. In fact, as will become apparent in the ensuing chapters, wavelet-based
representations are as convenient and natural for self-similar signals as Fourier-based
representations are for stationary and periodic signals. Furthermore, because wavelet
transformations can be implemented in a computationally efficient manner, the
wavelet transform is not only a theoretically important tool, but a practical one as
well.

We specifically consider two families of self-similar signals. The first is the family
of 1/f processes. These statistically self-similar processes are important
candidates for modeling a wide range of natural and man-made phenomena. Due
to their generally nonstationary character, 1/f processes have properties that are
rather distinct from those of the traditional models used in signal processing. In contrast
to the well-studied family of ARMA processes, for example, 1/f processes typically
exhibit long-term statistical dependence. Yet despite their apparent applicability in
many contexts, 1/f models have not enjoyed widespread use in the signal processing
community. In large part, this has been due to the lack of a sufficiently convenient
mathematical characterization. The introduction of wavelet-based representations
for 1/f processes in this thesis allows us to address a wide range of signal modeling
and signal processing problems involving 1/f processes in a highly straightforward
manner. In particular, we are able to obtain computationally efficient algorithms
for classifying, parametrizing, and isolating 1/f signals. Furthermore, in contrast to
previous algorithms, those we develop are robust with respect to both measurement
noise and modeling errors.

The second family of self-similar signals we consider consists of homogeneous signals,
which we characterize in terms of a novel deterministic self-similarity relation.
Homogeneous signals, too, have highly efficient wavelet-based representations, and are
potentially useful in a wide range of engineering applications. As an example of one
promising direction for applications, we consider the use of homogeneous signal sets
in a communications-based context. Specifically, we develop an approach for embedding
information into homogeneous waveforms, which we term “fractal modulation.”
Because the resulting waveforms have the property that the information can be
recovered with either arbitrarily little duration or arbitrarily little bandwidth, we are
able to show that such signals are well-suited for transmission over noisy channels
of simultaneously unknown duration and bandwidth. Not only is this a reasonable
model for many physical channels, but it is also a reasonable model of the receiver
constraints inherent in many point-to-point and broadcast communication scenarios.
As a consequence of its special fractal and spectral properties, fractal modulation is
potentially useful in a range of military and commercial communication contexts. Indeed,
the concepts underlying fractal modulation may ultimately lead to novel and important
approaches for both low probability of intercept and shared-spectrum communications.

1.1  Outline  of  the  Thesis 

The detailed structure of the thesis is as follows. Chapter 2 is a review of wavelet theory.
In addition to establishing notation and summarizing the important results, this
review provides a particular perspective on wavelets and their relationship to signal
processing that is central to the thesis. Orthonormal wavelet basis signal decompositions
are interpreted first in terms of an octave-band filter bank structure that is
familiar to signal processors, and then in terms of a multiresolution signal analysis
from which new insights are obtained. In particular, we show how this interpretation
leads naturally to the computationally efficient discrete-time implementation via the
discrete wavelet transform.

Chapter 3 reviews the 1/f family of statistically self-similar random processes, and
develops some important new models for 1/f-like behavior in signals. In particular,
early in the chapter, we introduce a novel and useful frequency-based characterization
for 1/f processes, while in the latter half of the chapter we develop wavelet-based
representations for 1/f processes. Specifically, we demonstrate that orthonormal wavelet
basis expansions are Karhunen-Loève-like expansions for 1/f-type processes, i.e.,
when 1/f processes are expanded in terms of orthonormal wavelet bases, the coefficients
of the expansion are effectively uncorrelated. This powerful result is supported
both theoretically and empirically, and examples involving both simulated and real
data are included.

Exploiting the efficiency of wavelet basis expansions for 1/f processes, Chapter 4
develops solutions to some fundamental problems of detection and estimation involving
1/f-type signals. In particular, we develop both maximum likelihood parameter
estimation algorithms and minimum mean-square error signal estimation algorithms
for 1/f processes embedded in white measurement noise. Additionally, we address the
problem of coherent detection in 1/f backgrounds, as well as the problem of discriminating
between 1/f signals with different parameters. In each case, we provide useful
interpretations of the solutions to these problems in terms of wavelet-based synthesis
and whitening filters for 1/f processes. Results from a variety of simulations are
presented.

Chapter 5 introduces and develops our new family of homogeneous signals defined
in terms of a dyadic scale-invariance property. We distinguish between two classes,
energy-dominated and power-dominated, and develop their spectral properties. We
show that orthonormal self-similar bases can be constructed for homogeneous signals
using wavelets. Using these representations, we then derive highly efficient discrete-time
algorithms for synthesizing and analyzing homogeneous signals.

Chapter 6 develops the concept of fractal modulation. In particular, we use the
orthonormal self-similar basis expansions derived in Chapter 5 to develop an approach
for modulating discrete- or continuous-valued information sequences onto homogeneous
signals. After developing the corresponding optimal receivers, we evaluate the
performance of the resulting scheme in the context of a particular channel model.
Our analysis includes comparisons to more traditional forms of modulation.

Chapter 7 represents a rather preliminary and cursory investigation into the
system-theoretic foundations of the thesis. In particular, after defining scale-invariant
systems, we explore the relationships between such systems, self-similar signals, and
the wavelet transform. We observe that synthesis filters for the self-similar signals
we consider in the thesis are exactly or approximately linear jointly time- and
scale-invariant systems. Furthermore, we demonstrate that while the Laplace and Fourier
representations are natural for linear time-invariant systems, and while the Mellin
representation is natural for linear scale-invariant systems, it is the wavelet transform
that is most natural for linear systems that are jointly time- and scale-invariant. We
show, in fact, that wavelet representations lead to some very efficient and practical
computational structures for characterizing and implementing such systems.

Finally, Chapter 8 summarizes the principal contributions of the thesis and suggests
some interesting and potentially important directions for future research.
Chapter 2


Wavelet  Transformations 


Wavelet transformations play a central role in the study of self-similar signals and
systems. Indeed, as we shall see, the wavelet transform constitutes as natural a
tool for the manipulation of self-similar or scale-invariant signals as the Fourier
transform does for translation-invariant signals such as stationary, cyclostationary,
and periodic signals. Furthermore, just as the discovery of fast Fourier transform
(FFT) algorithms dramatically increased the viability of Fourier-based processing
of translation-invariant signals in real systems, the existence of fast discrete wavelet
transform (DWT) algorithms for implementing wavelet transformations means
that wavelet-based representations of self-similar signals are also of great practical
significance.

The theory of wavelet transformations dates back to the work of Grossmann and
Morlet [2], and was motivated by applications in seismic data analysis [3]. Many key
results in the theory of nonorthogonal wavelet expansions are described by Daubechies
in [4]. In this thesis, however, we shall be primarily interested in orthonormal wavelet
bases. The development of such bases, and their interpretation in the context
of multiresolution signal analysis, is generally attributed to Meyer [5] and Mallat
[6]. However, it was Daubechies who introduced the first highly practical families of
orthonormal wavelet bases in her landmark paper [1].

Although wavelet theory is rather new, it is important to note at the outset
that many of the ideas underlying wavelets are not new. Indeed, wavelet theory can
be viewed as a convenient and useful mathematical framework for formalizing and
relating some well-established methodologies from a number of diverse areas within
mathematics, physics, and engineering. Examples include:

-  pyramidal image decompositions in computer vision [7],

-  multigrid methods in the solution of partial-differential and integral equations [8],

-  spectrogram methods in speech recognition [9],

-  progressive transmission algorithms and embedded coding in communications
[10] [11], and

-  multirate filtering algorithms in digital audio [12], speech and image coding [13],
voice scrambling [12], and frequency division data multiplexing [14].

In fact, wavelet transformations are closely associated with a number of topics that
have been extensively explored in the signal processing literature in particular, including
constant-Q filter banks and time-frequency analysis [15], and quadrature mirror
and conjugate quadrature filter banks [12].

This chapter is designed as a self-contained overview of wavelet transformations
in general and of orthonormal wavelet transformations in particular. Although it
presents essentially no new results, it serves three main purposes. First, it establishes
the notational conventions for wavelets we adopt for the thesis. Second, it summarizes
the key results from wavelet theory we shall exploit in the applications in subsequent
chapters of the thesis. Third, the chapter introduces
wavelet transformations from a signal processing perspective, and it is this objective
that has led to the rather tutorial style of this chapter. While a number of excellent
introductions to wavelet theory can be found in the literature (see, e.g., [6]
[4] [16] [17]), we stress that the one presented here emphasizes a perspective that is
particularly important in light of the applications we consider in this thesis.
2.1  Wavelet  Bases


Most generally, the wavelet transformation of a signal $x(t)$

$$x(t) \longleftrightarrow X_\nu^\mu$$

is defined in terms of projections of $x(t)$ onto a family of functions that are all
normalized dilations and translations of a prototype “wavelet” function $\psi(t)$, i.e.,

$$X_\nu^\mu = \mathcal{W}\{x(t)\} = \int_{-\infty}^{\infty} x(t)\,\psi_\nu^\mu(t)\,dt \tag{2.1}$$

where

$$\psi_\nu^\mu(t) = |\mu|^{-1/2}\,\psi\!\left(\frac{t-\nu}{\mu}\right).$$

In this notation, $\mu$ and $\nu$ are the continuous dilation and translation parameters,
respectively, and take values in the range $-\infty < \mu, \nu < \infty$, $\mu \neq 0$. A necessary and
sufficient condition for this transformation to be invertible is that $\psi(t)$ satisfy the
admissibility condition

$$\int_{-\infty}^{\infty} |\Psi(\omega)|^2\,|\omega|^{-1}\,d\omega = C_\psi < \infty, \tag{2.2}$$

where $\Psi(\omega)$ is the wavelet’s Fourier transform. Provided $\psi(t)$ has reasonable decay
at infinity, (2.2) is equivalent to the admissibility condition

$$\int_{-\infty}^{\infty} \psi(t)\,dt = 0. \tag{2.3}$$

For any admissible $\psi(t)$, the synthesis formula corresponding to the analysis formula
(2.1) is then

$$x(t) = \mathcal{W}^{-1}\{X_\nu^\mu\} = \frac{1}{C_\psi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} X_\nu^\mu\,\psi_\nu^\mu(t)\,\mu^{-2}\,d\mu\,d\nu. \tag{2.4}$$
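As a rough numerical illustration of the analysis formula (2.1), the sketch below approximates a single transform coefficient by a midpoint Riemann sum. The Haar wavelet, the integration window, and the names `haar`, `psi_mu_nu`, and `cwt_coefficient` are assumptions chosen for illustration only; none of them appears in the text above.

```python
import math

def haar(t):
    """Haar wavelet (assumed example): +1 on [0, 1/2), -1 on [1/2, 1), else 0."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def psi_mu_nu(t, mu, nu, psi=haar):
    """Normalized dilation/translation |mu|^(-1/2) * psi((t - nu) / mu), mu != 0."""
    return psi((t - nu) / mu) / math.sqrt(abs(mu))

def cwt_coefficient(x, mu, nu, t_min=-8.0, t_max=8.0, steps=1 << 15, psi=haar):
    """Midpoint-rule approximation of the analysis integral (2.1)."""
    dt = (t_max - t_min) / steps
    total = 0.0
    for i in range(steps):
        t = t_min + (i + 0.5) * dt
        total += x(t) * psi_mu_nu(t, mu, nu, psi)
    return total * dt

# The |mu|^(-1/2) factor keeps unit energy at every scale, and the zero mean
# required by (2.3) makes the coefficient of a constant signal vanish.
unit_energy = cwt_coefficient(lambda t: psi_mu_nu(t, 2.0, 1.0), 2.0, 1.0)
constant_response = cwt_coefficient(lambda t: 1.0, 0.5, 0.25)
print(unit_energy, constant_response)
```

The window and step count are chosen so that the Haar breakpoints fall on grid edges; a smoother wavelet would need a finer grid or a proper quadrature rule.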

Under certain circumstances, it is also possible to reconstruct $x(t)$ solely from
samples of $X_\nu^\mu$ on some lattice defined by

$$\mu = a^{-m}$$

$$\nu = n\,b\,a^{-m}$$

where $-\infty < m < \infty$ and $-\infty < n < \infty$ are the integer dilation and translation
indices, respectively, and $a$ and $b$ are the corresponding dilation and translation
increments. In such cases, the collection of samples is termed a “frame.” A general
theory and some iterative reconstruction algorithms are presented in [4]. However,
it is also possible to construct wavelets and lattices such that the resulting
transformation is not only invertible, but orthonormal as well. In general, orthonormal
transformations are extremely convenient analytically, and possess very nice numerical
properties. Consequently, it is this class of wavelet transformations that is of
primary interest in this work, and the theory is summarized in the sequel.

2.2  Orthonormal  Wavelet  Bases 

Our focus in this section is on the particular case of dyadic orthonormal wavelet bases,
corresponding to the case $a = 2$ and $b = 1$, for which the theory is comparatively better
developed. In Section 2.2.7, however, we construct a simple family of orthonormal
wavelet bases corresponding to lattices defined by $a = (L + 1)/L$ and $b = L$, where
$L \ge 1$ is an integer.

An orthonormal wavelet transformation of a signal $x(t)$

$$x(t) \longleftrightarrow x_n^m$$

can be described in terms of the synthesis/analysis equations

$$x(t) = \mathcal{W}_d^{-1}\{x_n^m\} = \sum_m \sum_n x_n^m\,\psi_n^m(t) \tag{2.5a}$$

$$x_n^m = \mathcal{W}_d\{x(t)\} = \int_{-\infty}^{\infty} x(t)\,\psi_n^m(t)\,dt \tag{2.5b}$$

and has the special property that the orthogonal basis functions are all dilations and
translations of a single function referred to as the basic wavelet $\psi(t)$. In particular,

$$\psi_n^m(t) = 2^{m/2}\,\psi(2^m t - n) \tag{2.6}$$

where $m$ and $n$ are the dilation and translation indices, respectively.
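To make the orthonormality behind (2.5) and (2.6) concrete, the following sketch builds a few dyadic dilations and translations of a wavelet and checks that their Gram matrix is the identity. The Haar wavelet is an assumed example, since the text has not yet named a specific wavelet, and the helper names and index ranges are hypothetical.

```python
import itertools

def haar(t):
    """Haar wavelet (assumed example): +1 on [0, 1/2), -1 on [1/2, 1), else 0."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def psi_mn(t, m, n):
    """Basis function of (2.6): 2^(m/2) * psi(2^m * t - n)."""
    return 2.0 ** (m / 2.0) * haar(2.0 ** m * t - n)

def inner_product(m1, n1, m2, n2, t_min=-4.0, t_max=4.0, steps=1 << 12):
    """Midpoint-rule inner product; exact up to rounding here because the Haar
    functions used are piecewise constant on the dyadic grid."""
    dt = (t_max - t_min) / steps
    total = 0.0
    for i in range(steps):
        t = t_min + (i + 0.5) * dt
        total += psi_mn(t, m1, n1) * psi_mn(t, m2, n2)
    return total * dt

# A small, arbitrary set of (m, n) index pairs whose supports lie in [-4, 4].
indices = list(itertools.product((-1, 0, 1), (-1, 0, 1)))

def gram_matrix():
    """Matrix of all pairwise inner products for the index set above."""
    return [[inner_product(m1, n1, m2, n2) for (m2, n2) in indices]
            for (m1, n1) in indices]
```

Unit diagonal entries reflect the $2^{m/2}$ normalization in (2.6), and vanishing off-diagonal entries reflect orthogonality both across scales and across translations.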

An important example of a wavelet basis, and one to which we will refer on
numerous occasions throughout the thesis, is that derived from the ideal bandpass
wavelet $\psi(t)$. This wavelet is the impulse response of an ideal bandpass filter with
frequency response

$$\Psi(\omega) = \begin{cases} 1 & \pi < |\omega| \le 2\pi \\ 0 & \text{otherwise.} \end{cases} \tag{2.7}$$

It is straightforward to verify that the dilations and translations of $\psi(t)$ constitute
an orthonormal basis for the space of finite energy functions, $L^2(\mathbb{R})$. However, there
are many other examples of orthonormal wavelet bases.

The basic (or “mother”) wavelet $\psi(t)$ typically has a Fourier transform $\Psi(\omega)$ that
satisfies several more general properties. First, because for a fixed $m$ the $\{\psi_n^m(t)\}$
constitute an orthonormal set, we get the Poisson formula

$$\sum_k |\Psi(\omega - 2\pi k)|^2 = 1$$

whence

$$|\Psi(\omega)| \le 1. \tag{2.8a}$$

Moreover, from (2.3) we have immediately

$$\Psi(0) = 0. \tag{2.8b}$$
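For the ideal bandpass wavelet (2.7), these properties can be checked directly in the frequency domain. The sketch below is a minimal check under the assumption that the sum over $k$ can be truncated to a finite range (only finitely many terms are nonzero); the function names and test frequencies are hypothetical.

```python
import math

def Psi_ideal(w):
    """Frequency response (2.7) of the ideal bandpass wavelet."""
    return 1.0 if math.pi < abs(w) <= 2.0 * math.pi else 0.0

def poisson_sum(Psi, w, k_max=50):
    """Truncated Poisson sum: sum over k of |Psi(w - 2*pi*k)|^2."""
    return sum(abs(Psi(w - 2.0 * math.pi * k)) ** 2
               for k in range(-k_max, k_max + 1))

# Frequencies chosen away from multiples of pi, where the band edges of
# shifted copies meet and the pointwise value of the sum is ambiguous.
for w in (0.3, 1.7, 4.0, -5.5, 9.1):
    print(w, poisson_sum(Psi_ideal, w))
```

The shifted copies of the passband tile the frequency axis without gaps or overlap (apart from isolated edge points), which is why exactly one term of the sum is nonzero at a generic frequency.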


Finally, we are generally interested in regular bases, i.e., bases comprised of regular
basis functions. Regularity is a measure of the smoothness of a function. In particular,
a function $f(t)$ will be said to be $R$th-order regular if its Fourier transform $F(\omega)$ decays
according to¹

$$F(\omega) \sim O\left(|\omega|^{-R}\right), \quad |\omega| \to \infty.$$

We use the term “regular” to denote a function that is at least first-order regular, and
note that an $R$th-order regular function has $R-1$ regular derivatives. Consequently,
in order for our wavelet basis to be regular we require that

$$\Psi(\omega) \sim O\left(|\omega|^{-1}\right), \quad |\omega| \to \infty. \tag{2.8c}$$

As implied by (2.8a)-(2.8c), $\psi(t)$ is often the impulse response of an at least
roughly bandpass filter. Consequently, the wavelet transformation can usually be
interpreted either in terms of a generalized constant-Q (specifically, octave-band)
filter bank, or, as we shall see later, in terms of a multiresolution signal analysis.
While we will restrict our attention to this class of wavelet bases, it is important to
remark, however, that wavelets need not correspond to either an octave-band filter
bank or a multiresolution analysis. For example, the following wavelet due to Mallat
[18]

$$\Psi(\omega) = \begin{cases} 1 & \text{if } 4\pi/7 \le |\omega| \le \pi \text{ or } 4\pi \le |\omega| \le 32\pi/7 \\ 0 & \text{otherwise} \end{cases}$$

generates a perfectly valid orthonormal wavelet basis.
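One way to build some confidence in this example is to check numerically that the two-band response above still satisfies the Poisson formula preceding (2.8a), and that its dyadic dilates partition the frequency axis as in identity (2.9) of Section 2.2.1. The sketch below does this at a handful of arbitrarily chosen frequencies; the function names and truncation limits are assumptions, and passing these checks is a consistency test rather than a proof of orthonormality.

```python
import math

PI = math.pi

def Psi_mallat(w):
    """Two-band frequency response of the wavelet attributed to Mallat [18]."""
    a = abs(w)
    in_low_band = 4.0 * PI / 7.0 <= a <= PI
    in_high_band = 4.0 * PI <= a <= 32.0 * PI / 7.0
    return 1.0 if (in_low_band or in_high_band) else 0.0

def translate_sum(w, k_max=20):
    """Truncated sum over k of |Psi(w - 2*pi*k)|^2 (Poisson formula)."""
    return sum(Psi_mallat(w - 2.0 * PI * k) ** 2
               for k in range(-k_max, k_max + 1))

def dilate_sum(w, m_max=40):
    """Truncated sum over m of |Psi(2^(-m) * w)|^2 (identity (2.9))."""
    return sum(Psi_mallat(2.0 ** (-m) * w) ** 2
               for m in range(-m_max, m_max + 1))

# Generic test frequencies, chosen away from the band edges.
for w in (0.37, 1.9, 3.0, 5.0, -2.2, 13.3):
    print(w, translate_sum(w), dilate_sum(w))
```

Both sums equal one at generic frequencies even though the passband is split into two disjoint pieces, which illustrates why this wavelet does not fit the octave-band filter bank picture.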


¹The order notation $O(\cdot)$ used in this thesis is to be understood in the following sense. If

$$F(\omega) = O(G(\omega)), \quad \omega \to \infty$$

then

$$\lim_{\omega \to \infty} \frac{F(\omega)}{G(\omega)} < \infty.$$
2.2.1  An  Octave-Band  Filter  Bank  Interpretation

The filter bank interpretation of the wavelet transform arises by viewing the analysis
equation (2.5b) as a filter-and-sample operation, viz.,

$$x_n^m = \left\{ x(t) * \psi_0^m(-t) \right\}\Big|_{t = 2^{-m} n}.$$

Although the interpretation applies more generally, it is often convenient to visualize
the basis associated with the ideal bandpass wavelet (2.7). In this case, the output of
each filter in the bank is sampled at the corresponding Nyquist rate. More generally,
we say that the filter bank is critically-sampled [15], in that reconstruction is not possible
if any of the sampling rates are reduced regardless of the choice of wavelet. The
critically-sampled filter bank corresponding to the wavelet decomposition is depicted
in Fig. 2-1.

For  a  particular  choice  of  wavelet  basis,  the  magnitude  of  the  frequency  response 
of  the  filters  in  such  a  filter  bank  is  portrayed  in  Fig.  2-2.  As  this  figure  illustrates, 
there  can  be  significant  spectral  overlap  in  the  magnitude  responses  while  preserving 
the  orthogonality  of  the  decomposition.  In  essence,  while  the  frequency  response 
magnitudes  are  not  supported  on  disjoint  frequency  intervals,  aliasing  is  avoided — 
i.e.,  perfect  reconstruction  and  orthogonality  are  achieved — due  to  the  characteristics 
of  the  phase  in  the  filters.  However,  it  is  possible  to  construct  wavelet  bases  such  that 
the  spectral  overlap  between  channels  is  much  smaller  in  applications  where  this  is 
important. 

A filter bank decomposition is closely related to the notion of a local time-frequency
analysis. Provided the filters are reasonably bandpass in character, the
output of each filter in the bank is an estimate of the frequency content in the signal
localized to the corresponding frequency band. Likewise, provided the filter impulse
responses are localized in time, the sequence of output samples from each filter gives a
picture of the time-evolution of frequency content within the corresponding frequency
band. In the case of the wavelet decomposition, $(x_n^m)^2$ represents an estimate of the
energy of the signal $x(t)$ in the vicinity of $t \approx 2^{-m} n$, and for a band of frequencies
in the neighborhood of $\omega \approx 2^m \pi$. This is graphically depicted in the time-frequency
plane of Fig. 2-3(a). Note that the octave-band frequency partitioning leads to a
partitioning of the time axis that is finer in the higher (and wider) frequency bands.
We emphasize that the partitioning in this figure is idealized: in accordance with the
Fourier transform uncertainty principle, one cannot have perfect localization in both
time and frequency. Nevertheless, one can construct wavelet bases whose basis functions
have their energy concentrated at least roughly according to this partitioning.

Figure 2-2: The octave band filters corresponding to an orthonormal wavelet decomposition.
The wavelet basis in this example is one due to Daubechies [1]. (Horizontal axis: frequency $\omega$.)

In  contrast  to  the  wavelet  transform,  the  familiar  short-time  Fourier  transform 
(STFT)  representation  of  a  signal  corresponds  to  a  filter  bank  in  which  the  filters 
are  modulated  versions  of  one  another  and,  hence,  have  equal  bandwidth.  As  a 
consequence,  the  outputs  are  sampled  at  identical  rates,  and  the  corresponding  time- 
frequency  analysis  is  one  in  which  there  is  uniform  partitioning  of  both  the  time  and 
frequency  axes  in  the  time-frequency  plane,  as  depicted  in  Fig.  2-3(b). 

While the wavelet transform analysis equation (2.5b) can be interpreted in terms
of a filter bank decomposition, the corresponding synthesis equation (2.5a) may be
interpreted, as depicted in Fig. 2-4, as multirate modulation in which for a given $m$
each sequence of coefficients $x_n^m$ is modulated onto the corresponding wavelet dilate
$\psi_0^m(t)$ at rate $2^m$. For the case of the ideal bandpass wavelet, this corresponds to
modulating each such sequence $x_n^m$ into the distinct octave frequency band
$2^m \pi < \omega \le 2^{m+1} \pi$.

Figure 2-3: Time-frequency portraits corresponding to two signal analyses.
Panel (b): short-time Fourier transformation.

Figure 2-4: Interpretation of an orthonormal wavelet expansion as a multirate modulation
scheme.

The filter bank interpretation allows us to readily derive the following useful identity

$$\sum_m |\Psi(2^{-m}\omega)|^2 = 1 \tag{2.9}$$

valid for all orthonormal wavelet bases and any $\omega \neq 0$. Specifically, consider an
arbitrary finite-energy signal $x(t)$ with Fourier transform $X(\omega)$, which is decomposed
into an orthonormal wavelet basis via the filter bank of Fig. 2-1, then immediately
resynthesized according to the filter bank of Fig. 2-4. It is a straightforward application
of sampling theory to show that the Fourier transform of the output of this cascade
can be expressed as

$$X(\omega) \sum_m |\Psi(2^{-m}\omega)|^2 + \sum_{k \neq 0} \sum_m X(\omega - 2\pi k\,2^m)\,\Psi(2^{-m}\omega)\,\Psi^*(2^{-m}\omega - 2\pi k).$$

Since this must be equal to $X(\omega)$, the terms on the right must all be zero, while the
factor multiplying $X(\omega)$ must be unity, yielding the identity (2.9) as desired.
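Identity (2.9) is easy to probe numerically. The sketch below evaluates a truncated version of the sum for the ideal bandpass wavelet (2.7) and, as a second case, for the Haar wavelet, whose squared spectrum magnitude is $\sin^4(\omega/4)/(\omega/4)^2$. The Haar wavelet, the truncation limit `m_max`, and the function names are assumptions introduced for illustration.

```python
import math

def Psi_ideal_sq(w):
    """|Psi(w)|^2 for the ideal bandpass wavelet (2.7)."""
    return 1.0 if math.pi < abs(w) <= 2.0 * math.pi else 0.0

def Psi_haar_sq(w):
    """|Psi(w)|^2 for the Haar wavelet (assumed example): sin^4(w/4) / (w/4)^2."""
    if w == 0.0:
        return 0.0
    x = w / 4.0
    return math.sin(x) ** 4 / x ** 2

def dyadic_sum(Psi_sq, w, m_max=60):
    """Truncated left-hand side of (2.9): sum over m of |Psi(2^(-m) * w)|^2."""
    return sum(Psi_sq(2.0 ** (-m) * w) for m in range(-m_max, m_max + 1))

for w in (0.7, -3.3, 10.0, 123.456):
    print(w, dyadic_sum(Psi_ideal_sq, w), dyadic_sum(Psi_haar_sq, w))
```

For the ideal bandpass wavelet exactly one octave band contains any nonzero frequency, so a single term contributes. For the Haar wavelet many overlapping terms contribute, yet the sum is still one to within rounding and truncation error.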

While  the  filter  bank  interpretation  provides  a  natural,  convenient,  and  familiar 
framework  in  which  to  view  orthonormal  wavelet  transformations,  it  is  also  possible  to 
view  the  transformation  in  the  context  of  a  multiresolution  signal  analysis  framework 
[5]  [6]  [1].  This  perspective,  which  we  consider  next,  provides  a  number  of  additional, 
rich  insights  into  orthonormal  wavelet  bases. 

2.2.2  Multiresolution  Signal  Analysis  Interpretation 

In general, a multiresolution signal analysis is a framework for analyzing signals based
on isolating variations in the signal that occur on different temporal or spatial scales.
This strategy underlies a variety of diverse signal processing algorithms including
pyramidal methods used in the solution of computer vision problems [19] and multigrid
methods used in the solution of boundary value problems [8]. The basic analysis
algorithm involves approximating the signal at successively coarser scales through
repeated application of a smoothing or averaging operator. At each stage, a differencing
operation is used to extract a detail signal capturing the information between
consecutive approximations. The matching synthesis algorithm involves a successive
refinement procedure in which, starting from some coarsest scale approximation,
detail signals are accumulated in order to generate successively finer scale signal
approximations.
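The smoothing, differencing, and refinement steps just described can be sketched for a
finite sequence as follows. This is only an illustration under simple assumptions:
pairwise averaging stands in for the smoothing operator, the detail signal is kept at
the finer rate (so the representation is redundant), and the names `analyze` and
`synthesize` are hypothetical.

```python
def analyze(x, levels):
    """Return the coarsest approximation and the detail signals (finest first).

    The length of x is assumed to be divisible by 2**levels.
    """
    approx = list(x)
    details = []
    for _ in range(levels):
        # Smoothing: average adjacent pairs to halve the resolution.
        coarse = [(approx[2 * i] + approx[2 * i + 1]) / 2.0
                  for i in range(len(approx) // 2)]
        # Differencing: information lost between consecutive approximations.
        details.append([approx[i] - coarse[i // 2] for i in range(len(approx))])
        approx = coarse
    return approx, details

def synthesize(approx, details):
    """Accumulate details, coarsest first, to refine the approximation."""
    for detail in reversed(details):
        approx = [approx[i // 2] + detail[i] for i in range(len(detail))]
    return approx

x = [4.0, 2.0, 5.0, 7.0, 1.0, 3.0, 8.0, 6.0]
coarsest, details = analyze(x, 3)
print(coarsest, synthesize(coarsest, details))
```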

Specifically, orthonormal wavelet bases can be interpreted in the context of a particular
class of linear multiresolution signal analyses in which signal approximations
at all resolutions of the form $2^m$ (for $m$ an integer) are defined. In describing this
class, we begin formally by considering the Hilbert space of square-integrable signals
$V = L^2(\mathbb{R})$. A multiresolution signal analysis is then defined as a decomposition of
this signal space $V$ into a sequence of subspaces

$$ \ldots, V_{-1}, V_0, V_1, V_2, \ldots $$

such that each $V_m$ defines signal approximations at a resolution $2^m$. Associated with
each $V_m$ is a linear operator $A_m$ that defines projections from anywhere in $V$ onto
$V_m$. That is, for each signal $x(t) \in V$, the projection $A_m x(t) \in V_m$ defines the closest
signal of resolution $2^m$ to $x(t)$,

$$ A_m x(t) = \arg\min_{w(t) \in V_m} \| x(t) - w(t) \|. $$

Central to the concept of multiresolution analysis is the notion of being able to
construct successively coarser resolution approximations by repeated application of a
smoothing operator. Mathematically, this characteristic is obtained by imposing the
nesting or causality relation

$$ V_m \subset V_{m+1}, \tag{2.10a} $$

which specifically ensures that the approximation of a signal at resolution $2^{m+1}$
contains all the information necessary to approximate the signal at the coarser resolution
$2^m$, i.e.,

$$ A_m \{ A_{m+1} x(t) \} = A_m x(t). $$
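A discrete stand-in makes the nesting property concrete. In the sketch below, which is
an illustration rather than part of the formal development, a length-$2^J$ sequence
plays the role of the signal, and projection onto sequences that are constant on blocks
of length $2^{J-m}$ plays the role of $A_m$. The names `project` and `error` are
hypothetical.

```python
import math

def project(x, m, J):
    """Least-squares projection of a length-2**J sequence onto sequences that
    are constant on blocks of length 2**(J - m); a stand-in for A_m."""
    block = 2 ** (J - m)
    out = []
    for start in range(0, len(x), block):
        mean = sum(x[start:start + block]) / block
        out.extend([mean] * block)
    return out

def error(x, m, J):
    """Euclidean distance between x and its resolution-2**m approximation."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(x, project(x, m, J))))

J = 3
x = [4.0, 2.0, 5.0, 7.0, 1.0, 3.0, 8.0, 8.0]
# Nesting: approximating the finer approximation gives the coarser one.
print(project(project(x, 2, J), 1, J) == project(x, 1, J))
# The approximation error does not increase as the resolution grows.
print([error(x, m, J) for m in range(J + 1)])
```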

The relations

$$ \bigcup_{m=-\infty}^{\infty} V_m = V \tag{2.10b} $$

$$ \bigcap_{m=-\infty}^{\infty} V_m = \{0\} \tag{2.10c} $$

ensure that a complete range of approximations is defined by the analysis. In the
process, these completeness relations define arbitrarily good and arbitrarily poor
approximations that are consistent with any intuitive notion of resolution, i.e.,

$$ \lim_{m \to \infty} A_m x(t) = x(t) $$

$$ \lim_{m \to -\infty} A_m x(t) = 0. $$

An additional relation is required to fully define the notion of resolution: signals
in $V_m$ must be characterized by $2^m$ samples per unit length. Mathematically, this
can be interpreted as requiring that there exist an isometry between each space of
functions $V_m$ and the space of square-summable sequences $I = \ell^2(\mathbb{Z})$

$$ V_m \stackrel{\text{isom}}{\longleftrightarrow} I \tag{2.10d} $$

such that each sequence represents samples of the corresponding signal following some
potentially rather arbitrary linear processing:

$$ x(t) \in V_m \Leftrightarrow \varphi_m\{x(t)\}\big|_{t = 2^{-m} n} \in I \tag{2.10e} $$

where $\varphi_m$ is a linear operator.

In general, eqs. (2.10a) - (2.10e) are adequate to define a multiresolution signal
analysis. However, imposing two additional constraints leads to an analysis with
additional useful structure. The first is a translation-invariance constraint, viz.,

$$ x(t) \in V_m \Leftrightarrow x(t - 2^{-m} n) \in V_m \tag{2.10f} $$

which ensures that the nature of the approximation of the signal $x(t)$ is the same for
any time interval. It is this condition that shall lead to the translational relationships
among basis functions in the corresponding wavelet expansion. The second is a
scale-invariance constraint

$$ x(t) \in V_m \Leftrightarrow x(2t) \in V_{m+1} \tag{2.10g} $$

which ensures that the nature of the approximation at each resolution is the same.
In turn, it is this condition that shall give rise to the dilational relationships among
basis functions in the corresponding wavelet expansion.

It can be shown [18] that every multiresolution analysis of $L^2(\mathbb{R})$, i.e., every
collection of subspaces $V_m$ defined in accordance with (2.10a) - (2.10g), is completely
characterized in terms of a scaling function (or "father" wavelet) $\phi(t)$. Consequently,
from the scaling function one can construct an orthonormal basis for each $V_m$, and,
hence, the approximation operator $A_m$ for each of these subspaces. In particular, for
each $m$,

$$ \ldots, \phi_{-1}^m(t), \phi_0^m(t), \phi_1^m(t), \phi_2^m(t), \ldots $$

constitutes an orthonormal basis for $V_m$, where the basis functions, as a consequence
of the invariance constraints (2.10f) and (2.10g) imposed on the multiresolution
analysis, are all dilations and translations of one another, i.e.,

$$ \phi_n^m(t) = 2^{m/2} \phi(2^m t - n). \tag{2.11} $$

The corresponding resolution-$2^m$ approximation of a signal $x(t)$ is then obtained
as the projection of $x(t)$ onto $V_m$, which, exploiting the convenience of an orthonormal
basis expansion, is expressed as

$$ A_m x(t) = \sum_n a_n^m \phi_n^m(t) \tag{2.12} $$

with the coefficients $a_n^m$ computed according to the individual projections

$$ a_n^m = \int_{-\infty}^{\infty} x(t)\, \phi_n^m(t)\, dt. \tag{2.13} $$

In general, $\phi(t)$ has a Fourier transform $\Phi(\omega)$ that is at least roughly lowpass.
Using an argument similar to that which led to (2.8a), orthonormality of the basis
$\{\phi_n^m(t)\}_n$ implies that

$$ |\Phi(\omega)| \le 1. \tag{2.14a} $$

Additionally, because this basis for $V_m$ is asymptotically complete in $V$ (cf. (2.10b))
we have

$$ |\Phi(0)| = 1. \tag{2.14b} $$

Finally, since we are, again, generally interested in regular bases, we must have

$$ \Phi(\omega) \sim O\left(|\omega|^{-1}\right), \qquad \omega \to \infty. \tag{2.14c} $$

Collectively, the properties (2.14a) - (2.14c) describe a scaling function that is
consistent with the notion that $A_m$ is an approximation or smoothing operator.
Consequently, we may interpret the projection (2.13) as a lowpass-like filter-and-sample
operation, viz.,

$$ a_n^m = \left\{ x(t) * \phi_0^m(-t) \right\}\big|_{t = 2^{-m} n}. \tag{2.15} $$

Moreover, (2.12) can be interpreted as a modulation of these samples onto a
lowpass-like waveform.

In fact, one example of a multiresolution analysis is generated from the ideal
lowpass scaling function $\phi(t)$, whose Fourier transform is the frequency response of
an ideal lowpass filter, i.e.,

$$ \Phi(\omega) = \begin{cases} 1 & |\omega| \le \pi \\ 0 & |\omega| > \pi. \end{cases} \tag{2.16} $$

In this case, the corresponding multiresolution analysis is based upon perfectly
bandlimited signal approximations. Specifically, for a signal $x(t)$, $A_m x(t)$ represents $x(t)$
bandlimited to $\omega = 2^m \pi$. Furthermore, we may interpret (2.15) and (2.12) in the
context of classical sampling theory [20]. In particular, $\phi(t)$ in (2.15) plays the role
of an anti-aliasing filter [21], while (2.12) is the interpolation formula associated with
the sampling theorem.

Of course, there are practical difficulties associated with the implementation of a
multiresolution analysis based upon perfectly bandlimited approximations, foremost
of which is that the sampling and reconstruction filters, i.e., the $\phi_n^m(t)$, are
unrealizable. For this reason, this analysis is more of pedagogical than practical interest.

To derive the wavelet basis associated with each multiresolution analysis defined
via (2.10), we now shift our attention from the sequence of increasingly coarse scale
approximation signals $A_m x(t)$ to the detail signals representing the information lost
at each stage as the resolution is halved. The collection of resolution-limited signal
approximations constitutes a highly redundant representation of the signal. By
contrast, the collection of detail signals constitutes a much more efficient representation.
Formally, we proceed by decomposing each space $V_{m+1}$ into the subspace $V_m$ and its
orthogonal complement subspace $O_m$, i.e., $O_m$ satisfies

$$ O_m \perp V_m \tag{2.17a} $$

$$ O_m \oplus V_m = V_{m+1} \tag{2.17b} $$

where we recognize that it is in this orthogonal complement subspace that the detail
signal resides.

Associated with every multiresolution analysis is a basic wavelet $\psi(t)$ which yields
the following orthonormal basis for each $O_m$:

$$ \ldots, \psi_{-1}^m(t), \psi_0^m(t), \psi_1^m(t), \psi_2^m(t), \ldots $$

where $\psi_n^m(t)$ is as defined in terms of dilations and translations of $\psi(t)$ as per (2.6).
In turn, this leads to a convenient description of the projection operator $D_m$ from
anywhere in $V$ onto $O_m$ as

$$ D_m x(t) = \sum_n x_n^m \psi_n^m(t) $$

in terms of the individual projections (cf. (2.5b))

$$ x_n^m = \int_{-\infty}^{\infty} x(t)\, \psi_n^m(t)\, dt. $$

Hence, we have the interpretation that the wavelet coefficients $x_n^m$ for a fixed $m$
correspond to the detail signal $D_m x(t)$ at scale $2^m$, or, more specifically, to the information
in the signal $x(t)$ between the resolution-$2^m$ and resolution-$2^{m+1}$ approximations, i.e.,

$$ D_m x(t) = A_{m+1} x(t) - A_m x(t). $$

At this point, we recognize the wavelet associated with the bandlimited
multiresolution analysis defined via (2.16) to be the ideal bandpass wavelet (2.7); it suffices
to consider a frequency domain perspective. To complete the discussion, we observe
that via (2.17) we can recursively decompose any of the approximation subspaces $V_M$,
for some $M$, into the direct sum of a sequence of orthogonal subspaces, i.e.,

$$ V_M = O_{M-1} \oplus V_{M-1} = O_{M-1} \oplus (O_{M-2} \oplus V_{M-2}) = \cdots = \bigoplus_{m < M} O_m \tag{2.18} $$

from which we see that for every $x(t)$

$$ A_M x(t) = \sum_{m < M} D_m x(t) = \sum_{m < M} \sum_n x_n^m \psi_n^m(t). \tag{2.19} $$

This leads naturally to the interpretation of $A_M x(t)$ as an approximation in which
details on scales smaller than $2^M$ are discarded. Letting $M \to \infty$ in (2.19) yields

$$ x(t) = \sum_m \sum_n x_n^m \psi_n^m(t), $$

the synthesis formula (2.5a), and corresponds to the subspace decomposition

$$ V = \bigoplus_{m=-\infty}^{\infty} O_m. $$

This completes our interpretation of an orthonormal wavelet basis as a multiresolution
signal analysis.

2.2.3 Discrete Wavelet Transform


The discrete wavelet transform (DWT) refers to a discrete-time framework for
implementing the orthonormal wavelet transform. The basic notion is that rather
than implementing the analysis directly as a sequence of continuous-time
filter-and-sample operations according to (2.2.1), one can reformulate the analysis into a single
continuous-to-discrete conversion procedure followed by some iterative discrete-time
processing. Likewise, the synthesis can be reformulated from a series of conventional
modulations (2.5a) into an iterative discrete-time procedure followed by a single
discrete-to-continuous conversion.

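The implementation is based upon the discrete-time filters

$$ h[n] = \int_{-\infty}^{\infty} \phi_n^1(t)\, \phi_0^0(t)\, dt \tag{2.20a} $$

$$ g[n] = \int_{-\infty}^{\infty} \phi_n^1(t)\, \psi_0^0(t)\, dt. \tag{2.20b} $$

Typically, $h[n]$ and $g[n]$ have Fourier transforms $H(\omega)$ and $G(\omega)$ that have roughly
halfband lowpass and highpass characteristics, respectively. In fact, for the case of
the bandlimited multiresolution signal analysis, $h[n]$ and $g[n]$ are ideal lowpass and
highpass filters, specifically

$$ H(\omega) = \begin{cases} \sqrt{2} & 0 \le |\omega| \le \pi/2 \\ 0 & \pi/2 < |\omega| \le \pi \end{cases} $$

$$ G(\omega) = \begin{cases} 0 & 0 \le |\omega| \le \pi/2 \\ \sqrt{2} & \pi/2 < |\omega| \le \pi, \end{cases} $$

where the passband gain $\sqrt{2}$ follows from the normalization in (2.20).
More generally, as we shall see, the filters $h[n]$ and $g[n]$ form a quadrature mirror
filter (QMF) or conjugate quadrature filter (CQF) pair.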

The analysis algorithm is structured as follows. Given a signal $x(t) \in V$ from
which we would like to extract $x_n^m$ for $m \le M$, we can obtain the approximation
coefficients $a_n^{M+1}$ via the filter-and-sample procedure of (2.15), then recursively apply
the following filter-downsample algorithm

$$ a_n^m = \sum_l h[l - 2n]\, a_l^{m+1} \tag{2.21a} $$

$$ x_n^m = \sum_l g[l - 2n]\, a_l^{m+1} \tag{2.21b} $$

to extract the transform coefficients $x_n^m$ corresponding to successively coarser scales
$m$. A detailed derivation of this algorithm is presented in Appendix A.

The synthesis algorithm is structured in a complementary fashion. In particular,
to reconstruct $x(t)$ to resolution $2^{M+1}$ from $x_n^m$ for $m \le M$, we can recursively apply
the upsample-filter-merge algorithm

$$ a_n^{m+1} = \sum_l \left\{ h[n - 2l]\, a_l^m + g[n - 2l]\, x_l^m \right\} \tag{2.21c} $$

to compute the coefficients $a_n^m$ of successively finer scale approximations until level
$m = M$ is reached, after which $A_{M+1} x(t)$ may be constructed by modulating $a_n^{M+1}$
according to (2.12). A detailed derivation of this algorithm is also left to Appendix A.
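A single stage of the filter-downsample and upsample-filter-merge recursions can be
sketched directly from (2.21a) - (2.21c). The example below is a minimal illustration
under stated assumptions: the data are treated as periodic, the four-tap Daubechies
lowpass filter is used as a representative $h[n]$, $g[n]$ is formed as
$(-1)^n h[1 - n]$, and all function and variable names are hypothetical.

```python
import math

S3 = math.sqrt(3.0)
# Four-tap Daubechies lowpass filter h[n], n = 0..3, stored as {n: h[n]}.
H_D4 = {n: v / (4.0 * math.sqrt(2.0))
        for n, v in enumerate([1 + S3, 3 + S3, 3 - S3, 1 - S3])}

def highpass_from_lowpass(h):
    """g[n] = (-1)^n h[1 - n], returned as {n: g[n]}."""
    g = {}
    for k, v in h.items():
        n = 1 - k
        g[n] = -v if n % 2 else v
    return g

G_D4 = highpass_from_lowpass(H_D4)

def analysis_step(a_fine, h, g):
    """One filter-downsample stage with periodic indexing."""
    N = len(a_fine)
    a_coarse = [sum(c * a_fine[(2 * n + k) % N] for k, c in h.items())
                for n in range(N // 2)]
    detail = [sum(c * a_fine[(2 * n + k) % N] for k, c in g.items())
              for n in range(N // 2)]
    return a_coarse, detail

def synthesis_step(a_coarse, detail, h, g):
    """One upsample-filter-merge stage with periodic indexing."""
    N = 2 * len(a_coarse)
    a_fine = [0.0] * N
    for l, (a, x) in enumerate(zip(a_coarse, detail)):
        for k, c in h.items():
            a_fine[(2 * l + k) % N] += c * a
        for k, c in g.items():
            a_fine[(2 * l + k) % N] += c * x
    return a_fine

x = [4.0, 2.0, 5.0, 7.0, 1.0, 3.0, 8.0, 6.0]
a, d = analysis_step(x, H_D4, G_D4)
print(synthesis_step(a, d, H_D4, G_D4))  # reproduces x up to rounding
```

Iterating `analysis_step` on its own lowpass output yields the coefficients at
successively coarser scales.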

Fig. 2-5 depicts the discrete-time relationships between approximation and detail
coefficients corresponding to adjacent scales. The complete algorithm for computing
wavelet coefficients based on the discrete wavelet transform is depicted in Fig. 2-6.

The DWT may be computed extremely efficiently using polyphase forms [17].
Indeed, if the filters $h[n]$ and $g[n]$ have length $L$, an implementation of the DWT
via an FFT-based algorithm generally has an asymptotic computational complexity
of $O(\log L)$ per input sample [22]. However, as discussed in [17], this figure can
be somewhat misleading as there are many subtle issues associated with measuring
the complexity of the algorithm.

[Figure 2-5: A single stage of the discrete wavelet transform algorithm. (a) The analysis
step: filter-downsample. (b) The synthesis step: upsample-filter-merge.]

[Figure 2-6: An efficient implementation of the orthonormal wavelet transformation
based on the discrete wavelet transform. (a) The analysis algorithm. (b) The synthesis
algorithm.]

2.2.4 Finite Data Length and Resolution Effects

In most applications, the data consists of a finite collection of samples

$$ x[n], \qquad n = 0, 1, \ldots, N. $$

While it is usually assumed that the $x[n]$ correspond to samples of a resolution-limited
approximation of a continuous-time signal $x(t)$, i.e.,

$$ x[n] = a_n^{M+1} = \int_{-\infty}^{\infty} x(t)\, \phi_n^{M+1}(t)\, dt $$

for some $M$, this cannot always be justified. Nevertheless, if the signal $x(t)$ was
processed by a typical anti-aliasing filter prior to sampling, then it is often a
useful approximation, particularly if the anti-aliasing filter has characteristics similar
to those of the smoothing filter associated with the approximation operator.

Note that while the discrete-time nature of the data limits access to the finer scales
of detail, the length of the observations limits access to the coarser scales of detail.
Hence, in practice we typically have access to wavelet coefficients over a finite range
of scales for a given signal. Moreover, because the effective width of the wavelet basis
functions halves at each finer scale, we expect roughly a doubling of the number of
available coefficients at each successively finer scale. In a typical scenario, for a data
record of $N = N_0 2^M$ samples, we would expect to be able to extract $x_n^m$ corresponding
to

$$ m = 1, 2, \ldots, M $$

$$ n = 0, 1, \ldots, N_0 2^{m-1} - 1 $$

via the DWT, where $N_0$ is a constant that depends on the particular wavelet basis.
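The bookkeeping implied by these index ranges is easy to tabulate. The short sketch
below counts the detail coefficients at each scale; the function name is hypothetical,
and the observation that $N_0$ numbers remain for the coarsest approximation assumes a
critically sampled transform.

```python
def detail_counts(n0, M):
    """Number of wavelet coefficients x_n^m available at scales m = 1, ..., M."""
    return {m: n0 * 2 ** (m - 1) for m in range(1, M + 1)}

n0, M = 2, 4
counts = detail_counts(n0, M)
total_details = sum(counts.values())
# Detail coefficients account for all but n0 of the N = n0 * 2**M samples.
print(counts, total_details, n0 * 2 ** M - total_details)
```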

Note that while there are a number of ways to handle the unusual data windowing
problem inherent in the wavelet decomposition, an assumption that the data is
periodic outside the observation window leads to a computationally convenient
implementation, and one we shall use in the context of this thesis. See [23] for some
discussion and alternative approaches for addressing issues of windowing.

2.2.5 Orthonormal Wavelet Basis Constructions

As we have indicated, for every multiresolution analysis characterized by a scaling
function, there exists an associated wavelet basis. In fact, it is possible to exploit
the structure of the discrete wavelet transform to show how the wavelet $\psi(t)$ may
always be derived directly from the scaling function $\phi(t)$. In this section we describe
how this is accomplished. More generally, we show how one can construct a family
of orthonormal wavelet bases directly from a class of discrete-time filters.

We begin by observing that there are a number of properties that the discrete-time
filters $h[n]$ and $g[n]$ corresponding to a multiresolution signal analysis must satisfy. For
instance, as a consequence of orthogonality constraints between the $\{\psi_n^m(t)\}$ and
$\{\phi_n^m(t)\}$, one can show [18] that $h[n]$ and $g[n]$ must be related by

$$ g[n] = (-1)^n\, h[1 - n] $$

which, expressed in the frequency domain for real-valued $h[n]$, is

$$ G(\omega) = -e^{-j\omega} H^*(\omega + \pi). \tag{2.22} $$

Furthermore, orthonormality of the $\{\phi_n^m(t)\}$ requires that $h[n]$ satisfy

$$ |H(0)|^2 = 2 \tag{2.23a} $$

$$ |H(\omega)|^2 + |H(\omega + \pi)|^2 = 2. \tag{2.23b} $$

Filter pairs that satisfy both (2.22) and (2.23) are termed conjugate quadrature or
quadrature mirror filters and have been discussed extensively in the signal processing
literature [12].

Note that (2.22) leads immediately to an algorithm for constructing the wavelet
corresponding to a particular scaling function: one can generate $h[n]$ from $\phi(t)$ via
(2.20a), $g[n]$ from $h[n]$ via (2.22), then $\psi(t)$ from $g[n]$ and $\phi(t)$ via (A.1b).
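The filter conditions can be checked numerically for a specific filter. The sketch
below uses the four-tap Daubechies lowpass filter as an assumed example, forms $g[n]$
from the time-domain relation $g[n] = (-1)^n h[1 - n]$, and evaluates (2.23a), (2.23b),
and the frequency-domain relation obtained by transforming that time-domain relation
directly, $G(\omega) = -e^{-j\omega} H^*(\omega + \pi)$. The helper names are hypothetical.

```python
import cmath
import math

S3 = math.sqrt(3.0)
h = {n: v / (4.0 * math.sqrt(2.0))
     for n, v in enumerate([1 + S3, 3 + S3, 3 - S3, 1 - S3])}
# g[n] = (-1)^n h[1 - n], indexed by n = 1 - k.
g = {1 - k: (-v if (1 - k) % 2 else v) for k, v in h.items()}

def dtft(coeffs, w):
    """Discrete-time Fourier transform sum_n c[n] exp(-j w n)."""
    return sum(v * cmath.exp(-1j * w * n) for n, v in coeffs.items())

def power_complementary_error(w):
    """Deviation of |H(w)|^2 + |H(w + pi)|^2 from 2."""
    return abs(abs(dtft(h, w)) ** 2 + abs(dtft(h, w + math.pi)) ** 2 - 2.0)

def qmf_relation_error(w):
    """Deviation of G(w) from -exp(-j w) conj(H(w + pi))."""
    rhs = -cmath.exp(-1j * w) * dtft(h, w + math.pi).conjugate()
    return abs(dtft(g, w) - rhs)

print(abs(dtft(h, 0.0)) ** 2)
print(max(power_complementary_error(w) for w in (0.0, 0.4, 1.3, 2.9)))
print(max(qmf_relation_error(w) for w in (0.0, 0.4, 1.3, 2.9)))
```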

However, note that $h[n]$ alone is also sufficient to fully characterize a wavelet basis
through a multiresolution analysis. Indeed, given $h[n]$, the dilation equation^ (A.1a)
can be solved for the corresponding scaling function $\phi(t)$. In particular, $\phi(t)$ has
Fourier transform

$$ \Phi(\omega) = \prod_{m=1}^{\infty} \left[ 2^{-1/2} H(2^{-m}\omega) \right] \tag{2.24} $$

which is intuitively reasonable from a recursive decomposition of the corresponding
frequency domain equation, viz., (A.2a).
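For the Haar filter, $h[0] = h[1] = 1/\sqrt{2}$, the infinite product can be compared
against the known transform of the unit-width box function,
$e^{-j\omega/2} \sin(\omega/2)/(\omega/2)$. The following sketch truncates the product
after a fixed number of factors; the truncation length and function names are
illustrative assumptions.

```python
import cmath
import math

def H_haar(w):
    """Frequency response of the Haar lowpass filter h[0] = h[1] = 1/sqrt(2)."""
    return (1.0 + cmath.exp(-1j * w)) / math.sqrt(2.0)

def phi_product(w, terms=40):
    """Truncated product of 2^{-1/2} H(2^{-m} w) over m = 1, ..., terms."""
    prod = 1.0 + 0.0j
    for m in range(1, terms + 1):
        prod *= H_haar(w / 2.0 ** m) / math.sqrt(2.0)
    return prod

def phi_box(w):
    """Fourier transform of the box function on [0, 1)."""
    if w == 0.0:
        return 1.0 + 0.0j
    return cmath.exp(-0.5j * w) * math.sin(w / 2.0) / (w / 2.0)

for w in (0.0, 0.5, 2.0, 5.0, 9.0):
    print(w, abs(phi_product(w) - phi_box(w)))
```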

In fact, the conditions (2.23) on $h[n]$ are necessary but not sufficient for (2.24) to
generate a regular wavelet basis. However, choosing $h[n]$ to satisfy both (2.23) and
to have a Fourier transform $H(\omega)$ with $R$ zeros at $\omega = \pi$, i.e.,

$$ H^{(r)}(\pi) = 0, \qquad r = 0, 1, \ldots, R - 1 $$

is sufficient to generate a wavelet basis with $R$th-order regularity. Moreover, in this
case, we find, via (A.2a), that the wavelet has $R$ vanishing moments:

$$ \int_{-\infty}^{\infty} t^r\, \psi(t)\, dt = (j)^r\, \Psi^{(r)}(0) = 0, \qquad r = 0, 1, \ldots, R - 1. $$

This vanishing moment property has been exploited in applications involving the
implementation of linear operators [24] as well as in image coding. In the context
of this work, we will provide evidence to suggest that this property may also be
important when wavelet bases are used in representations for self-similar signals. It
is important to note, however, that the vanishing moment condition is not necessary
for regularity. For a more detailed discussion of necessary and sufficient conditions,
see, e.g., [25].
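The zeros of $H(\omega)$ at $\omega = \pi$ translate into simple sums on the filter taps,
since $H^{(r)}(\pi)$ is proportional to $\sum_n (-1)^n n^r h[n]$. As a hedged illustration,
the sketch below evaluates these alternating moments for the four-tap Daubechies
filter, which is assumed here as an example with $R = 2$; the function name is
hypothetical.

```python
import math

S3 = math.sqrt(3.0)
h = [v / (4.0 * math.sqrt(2.0)) for v in (1 + S3, 3 + S3, 3 - S3, 1 - S3)]

def alternating_moment(taps, r):
    """sum_n (-1)^n n^r h[n], proportional to the r-th derivative of H at pi."""
    return sum((-1) ** n * n ** r * v for n, v in enumerate(taps))

# The first two moments vanish (two zeros at pi); the third does not.
print([alternating_moment(h, r) for r in range(3)])
```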

A variety of useful wavelet bases have been constructed from filter formulations of
this type. In fact, this approach has been extremely useful in designing orthonormal
wavelets with compact support, i.e., wavelets for which

$$ \psi(t) = 0, \qquad |t| > T $$

for some $0 < T < \infty$. This is a consequence of the natural correspondence between
compactly-supported wavelets and the extensively developed theory of finite impulse
response (FIR) digital filters. A more detailed discussion of relationships between
wavelet theory and filter bank theory can be found in [17].

^For a further discussion of dilation equations, see, e.g., [16].


2.2.6  Examples 

In this section, we briefly review some standard examples of wavelet bases. Thus
far, we have discussed only one example, the wavelet basis corresponding to the ideal
bandpass wavelet (2.7). This basis has excellent frequency localization properties, but
very poor time-domain localization. Indeed, the corresponding wavelet $\psi(t)$ decays
only like $1/t$ for large $t$, and the QMF filters $h[n]$ and $g[n]$ decay only like $1/n$ for
large $n$. More seriously, this basis is unrealizable.

At the other extreme, consider a Haar-based multiresolution analysis in which the
approximations at resolution $2^m$ are piecewise constant on intervals of length $2^{-m}$.
Here the scaling function is given by

$$ \phi(t) = \begin{cases} 1 & 0 \le t < 1 \\ 0 & \text{otherwise} \end{cases} $$

and the corresponding wavelet is

$$ \psi(t) = \begin{cases} 1 & 0 \le t < 1/2 \\ -1 & 1/2 \le t < 1 \\ 0 & \text{otherwise.} \end{cases} $$

This analysis is realizable and exhibits excellent time localization but very poor
frequency localization due to the abrupt time-domain transitions of the approximations.
Indeed, $\Psi(\omega)$ falls off only like $1/\omega$ for $\omega \to \infty$.

More generally, we can consider the family of Battle-Lemarie wavelet bases [18] [1].
These bases may be derived from a multiresolution analysis based upon orthogonalized
$P$th-order spline functions. For these bases, the corresponding scaling function is
given by

$$ \Phi(\omega) = \left[ \omega^{P+1} \sqrt{ \sum_k \frac{1}{(\omega + 2\pi k)^{2(P+1)}} } \right]^{-1}. $$

For example, the first-order ($P = 1$) Battle-Lemarie multiresolution analysis
corresponds to piecewise-linear but continuous signal approximations. In this context, it
is trivial to show that the Haar-based wavelet basis we have discussed corresponds to
the case $P = 0$. Similarly, using an argument based on the central limit theorem, it is
possible to show that the bandpass wavelet basis corresponds to $P \to \infty$. The
Battle-Lemarie bases have very reasonable localization properties: they are characterized by
exponential decay in the time domain and decay like $1/|\omega|^{P+1}$ in the frequency
domain. Hence, while they are, strictly speaking, unrealizable, the exponential decay
property ensures that good approximations may be realized via truncation.

Daubechies has designed an important class of compactly-supported wavelet bases
[1] based upon discrete-time FIR filters. In addition to fulfilling a practical
requirement of having finite-extent basis functions, these bases exhibit good localization in
both time and frequency. The $R$th-order Daubechies basis is characterized by QMF
filters $h[n]$ and $g[n]$ of length $2R$ for $R = 1, 2, \ldots$, where the case $R = 1$ corresponds
to the Haar-based wavelet basis. Moreover, the basis functions are maximally regular,
in the sense that they have the maximum number of vanishing moments ($R$) for a
given order.

In general, the development of other families of wavelet-based multiresolution
analyses continues to receive considerable attention in the literature, as described in,
e.g., [17].

2.2.7  Non-Dyadic  Orthonormal  Wavelet  Bases 

While we have focused largely upon dyadic wavelet bases, for which the dilation
and translation increments are $a = 2$ and $b = 1$, there are many other non-dyadic
choices. In many applications, including those within the context of this thesis, such
generalizations are potentially very useful, particularly for $1 < a < 2$. This is because
these correspond to an analysis with finer frequency resolution on the logarithmic
frequency scale. For instance, to have flexibility in choosing from among a family of
bases corresponding to the lattice

$$ a = (L + 1)/L $$

$$ b = L, $$

where $L = 1, 2, \ldots$ is a parameter, would be highly convenient. An at least
conceptually useful class of such bases arises out of a generalization of the bandpass basis
defined by

$$ \Psi(\omega) = \begin{cases} \sqrt{L} & \pi < |\omega| \le \dfrac{L+1}{L}\pi \\ 0 & \text{otherwise} \end{cases} $$

where the case $L = 1$ corresponds to the usual bandpass basis. It is a straightforward
exercise in analysis to verify that for each $L$ the corresponding set $\{\psi_n^m(t)\}$, for which

$$ \psi_n^m(t) = a^{m/2}\, \psi(a^m t - n b), $$

is complete and orthonormal. Unfortunately, however, the poor time-domain
localization of these bases considerably reduces their practical value.

Chapter 3


Statistically  Self-Similar  Signals 


Some of the most prevalent forms of fractal geometry in nature arise out of statistical
scaling behavior in the underlying physical phenomena. In this chapter, we study an
important class of statistically scale-invariant or self-similar random processes known
as 1/f processes. These empirically defined processes, in particular, model a wide
range of natural signals.

In the first half of this chapter we first review the empirical properties of 1/f
processes and a traditional mathematical model for 1/f behavior based on the fractional
Brownian motion framework of Mandelbrot and Van Ness [26]. We then introduce
and study a new and potentially important mathematical characterization for 1/f
processes. The novelty and power of this characterization is its basis in the frequency
domain, which admits a broader range of Fourier tools in the analysis of 1/f
processes. In addition, we are able to show that our characterization includes the models
of Mandelbrot and Van Ness, yet appears to avoid some of their limitations.

The latter half of the chapter develops models for the more broadly defined class
of nearly-1/f processes, which constitute equally useful models for many natural
signals. For completeness, we first review some well-known ARMA-based constructions
for nearly-1/f processes. However, the principal focus in this section is on
developing some new, powerful and efficient wavelet-based nearly-1/f models. Using our
frequency-based characterization of 1/f processes, we are able to show that a rather
broad class of wavelet bases yield Karhunen-Loeve-like expansions for nearly-1/f
processes. As a consequence, it is reasonable to model 1/f processes as orthonormal
wavelet basis expansions in terms of uncorrelated coefficients. This suggests
that wavelet-based analysis of 1/f-type behavior is not only convenient, but, in some
sense, statistically optimal. In fact, in Chapter 4, we show how these wavelet-based
representations are extremely useful in addressing problems of optimum detection
and estimation involving 1/f-type signals.

We begin with a rather universally accepted definition. A random process $x(t)$
defined on $-\infty < t < \infty$ is said to be statistically self-similar if its statistics are
invariant to dilations and compressions of the waveform in time. More specifically, a
random process $x(t)$ is statistically self-similar with parameter $H$ if for any real $a > 0$
it obeys the scaling relation

$$ x(t) \stackrel{\mathcal{P}}{=} a^{-H} x(at) \tag{3.1} $$

where $\stackrel{\mathcal{P}}{=}$ denotes equality in a statistical sense. For strict-sense self-similar processes,
this equality is in the sense of all finite-dimensional joint probability distributions.
For wide-sense self-similar processes, the equality may be interpreted in the sense of
second-order statistics, i.e., mean and covariance functions. In this latter case, the
self-similarity relation (3.1) may be alternately expressed as

$$ M_x(t) = E[x(t)] = a^{-H} M_x(at) \tag{3.2a} $$

$$ R_x(t, s) = E[x(t)\, x(s)] = a^{-2H} R_x(at, as). \tag{3.2b} $$

We will restrict our attention to Gaussian processes, for which the two definitions
are, of course, equivalent. Furthermore, we will consider only zero-mean processes.

Even Gaussian processes satisfying (3.1) can exhibit great diversity in behavior.
Some are stationary, as is the case with the classical generalized process $w(t)$
corresponding to zero-mean, stationary, white Gaussian noise. This process, whose
autocorrelation function is an impulse, is self-similar with parameter $H = -1/2$. More
typically, though, self-similar processes are nonstationary. For example, the Wiener
process (Brownian motion) $z(t)$ related to $w(t)$ by^

$$ z(t) = \int_0^t w(\tau)\, d\tau, \qquad t > 0 \tag{3.3} $$

and extended to $t < 0$ through the convention

$$ \int_0^t \triangleq -\int_t^0 \tag{3.4} $$

for all $t$ is statistically self-similar with $H = 1/2$ and nonstationary but, evidently,
has a stationary derivative. As a final example, the Gaussian process

$$ x(t) = |t|^{H_0 - 1/2}\, z(t) \tag{3.5} $$

is self-similar with parameter $H_0$ for all values of $H_0$, is nonstationary, and has a
nonstationary derivative except for $H_0 = 1/2$. In fact, when $x(t)$ in (3.5) is filtered
by virtually any non-trivial linear time-invariant filter, the output is a nonstationary
process. However, while most physical processes that exhibit self-similarity are
fundamentally nonstationary, they retain a stationary quality to them. For this reason,
processes such as (3.5) generally constitute rather poor models for such phenomena.
By contrast, perhaps the most important class of models for such phenomena are the
empirically characterized "1/f processes."
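The wide-sense scaling relation (3.2b) can be verified numerically for simple
covariance functions. The sketch below, offered only as an illustration, uses the
standard covariance $\min(t, s)$ of a unit-rate Wiener process for $t, s \ge 0$, and a
hypothetical modulated example that assumes a process of the form
$|t|^{H_0 - 1/2} z(t)$; all function names are assumptions made for this example.

```python
def wiener_cov(t, s):
    """Covariance E[z(t) z(s)] of a unit-rate Wiener process, t, s >= 0."""
    return min(t, s)

def modulated_cov(H0):
    """Covariance of |t|^(H0 - 1/2) z(t) for t, s > 0 (assumed example)."""
    return lambda t, s: (t * s) ** (H0 - 0.5) * min(t, s)

def satisfies_scaling(cov, H, scales=(0.25, 2.0, 10.0),
                      points=((0.5, 1.5), (2.0, 2.0), (3.0, 0.1))):
    """Check R(t, s) = a^(-2H) R(at, as) at a few scales and time pairs."""
    for a in scales:
        for t, s in points:
            lhs = cov(t, s)
            rhs = a ** (-2.0 * H) * cov(a * t, a * s)
            if abs(lhs - rhs) > 1e-12 * max(1.0, abs(lhs)):
                return False
    return True

print(satisfies_scaling(wiener_cov, 0.5))          # H = 1/2 for the Wiener process
print(satisfies_scaling(wiener_cov, 0.3))          # any other H fails
print(satisfies_scaling(modulated_cov(0.8), 0.8))  # modulated example, H = H0
```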


^Throughout this chapter, integrals with respect to the differential element $w(t)\,dt$, where $w(t)$ is a
stationary white Gaussian noise, should be interpreted more precisely as integrals with respect to the
differential element $dz(t)$, where $z(t)$ is the corresponding Wiener process. While it is customary to
consider $w(t)$ to be the derivative of $z(t)$, recall that the non-differentiability of $z(t)$ means that $w(t)$
is its derivative only in a generalized sense. It is for this reason that an ordinary Riemann integral is
technically inadequate, and the corresponding Riemann-Stieltjes integral is required. Nevertheless, we
retain the notation $w(t)\,dt$ for conceptual convenience.

3.1 1/f Processes

The 1/f family of statistically self-similar random processes is generally defined as
processes having measured power spectra obeying a power law relationship of the
form

$$ S_x(\omega) \sim \frac{\sigma_x^2}{|\omega|^{\gamma}} \tag{3.6} $$

for some spectral parameter $\gamma$ related to $H$ according to

$$ \gamma = 2H + 1. \tag{3.7} $$
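The mapping between the self-similarity parameter and the spectral exponent is simple
to tabulate. The sketch below evaluates $\gamma = 2H + 1$ for the two examples mentioned
earlier (white noise with $H = -1/2$ and the Wiener process with $H = 1/2$) and confirms
that a spectrum of the assumed form $\sigma_x^2 / |\omega|^{\gamma}$ has slope $-\gamma$ on
log-log axes. The function names and the unit value of $\sigma_x^2$ are illustrative
assumptions.

```python
import math

def spectral_exponent(H):
    """Spectral parameter gamma = 2H + 1."""
    return 2.0 * H + 1.0

def power_law_spectrum(w, gamma, sigma2=1.0):
    """Power-law spectrum sigma^2 / |w|^gamma (assumed form)."""
    return sigma2 / abs(w) ** gamma

def loglog_slope(gamma, w_lo=1.0, w_hi=1000.0):
    """Slope of the spectrum on log-log axes across several decades."""
    num = math.log10(power_law_spectrum(w_hi, gamma)) - \
        math.log10(power_law_spectrum(w_lo, gamma))
    return num / (math.log10(w_hi) - math.log10(w_lo))

print(spectral_exponent(-0.5))  # white noise: flat spectrum
print(spectral_exponent(0.5))   # Wiener process
print(loglog_slope(spectral_exponent(0.0)))
```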


Generally, the power law relationship (3.6) extends over several decades of frequency.
While data length generally limits access to spectral information at lower frequencies,
data resolution limits access to spectral content at higher frequencies. Nevertheless,
there are many examples of phenomena for which arbitrarily large data records justify
a 1/f spectrum of the form (3.6) over all accessible frequencies. However, (3.6) is not
integrable and, hence, strictly speaking, does not constitute a valid power spectrum
in the theory of stationary random processes. As a consequence, there have been
numerous attempts to attach an interpretation to such spectra based on notions of
generalized spectra [27] [26] [28] [29].

As a consequence of their inherent self-similarity, the sample paths of 1/f processes
are typically fractals, and have a fractal dimension in the sense of Mandelbrot [30]
which characterizes their roughness. In fact, for the fractional Brownian motion
models for 1/f processes we discuss in Section 3.1.1, the dependence of the fractal
dimension $D$ on the self-similarity parameter $H$ can be derived analytically. More
generally, the relationship between $D$ and $H$, whereby increasing the parameter $H$
yields a decrease in the fractal dimension $D$, may be obtained empirically. This is
intuitively reasonable: an increase in $H$ corresponds to an increase in $\gamma$, which, in
turn, reflects a redistribution of power from high to low frequencies and leads to
sample functions that are increasingly smooth in appearance. Fig. 3-1 illustrates
some sample paths of 1/f processes corresponding to various values of $\gamma$. Note that
what we have plotted are, in some sense, bandpass filtered versions of the sample
functions, since the finite data length constrains the lowest accessible frequency and
the discretization of the time-axis constrains the highest accessible frequency.

Figure 3-1: Sample paths of 1/f processes corresponding to different values of $\gamma$.

A truly enormous and tremendously varied collection of natural phenomena exhibit
1/f-type spectral behavior over many decades of frequency. A partial list includes
(see, e.g., [28] [31] [32] [30] [33] [34] and the references therein):

-  geophysical  time  series  such  as  variation  in  temperature  and  rainfall  records, 
measurements  of  oceanic  flows,  flood  level  variation  in  the  Nile  river,  wobble 
in  the  Earth’s  axis,  frequency  variation  in  the  Earth’s  rotation,  and  sunspot 
variations; 

-  economic  time  series  such  as  the  Dow  Jones  Industrial  Average; 

-  physiological  time  series  such  as  instantaneous  heart  rate  records  for  healthy 
patients,  EEG  variations  under  pleasing  stimuli,  and  insulin  uptake  rate  data 
for  diabetics; 

-  biological  time  series  such  as  voltages  across  nerve  and  synthetic  membranes; 

-  electromagnetic  fluctuations  such  as  in  galactic  radiation  noise,  the  intensity  of 
light  sources,  and  flux  flow  in  superconductors; 

-  electronic device noises in field effect and bipolar transistors, vacuum tubes,
and  Schottky,  Zener  and  tunnel  diodes; 

-  resistance fluctuations in metal film, semiconductor films and contacts, germanium
filaments in carbon and aqueous solution, thermocells, and concentration
cells;

-  frequency  variation  in  hourglasses,  quartz  crystal  oscillators,  atomic  clocks,  and 
superconducting  cavity  resonators; 

-  man-induced  phenomena  including  variations  in  traffic  flow  and  amplitude  and 
frequency  variation  in  Western,  African,  Asian  and  Indian  music,  both  modern 
and  traditional; 

-  generation  of  perceptually-pleasing  physiological  stimuli,  such  as  artificial  music 
and  breezes; 
-  burst errors on communication channels;


-  texture  variation  in  natural  terrain,  landscapes,  and  cloud  formations. 

While $\gamma \approx 1$ in many of these examples, more generally $0 < \gamma < 2$. However,
there are many examples of phenomena in which $\gamma$ lies well outside this range. For
$\gamma > 1$, the lack of integrability of (3.6) in a neighborhood of the spectral origin
reflects the preponderance of low-frequency energy in the corresponding processes. This
phenomenon is termed the infrared (IR) catastrophe. For many physical phenomena,
measurements corresponding to very small frequencies show no low-frequency roll-off,
which is usually understood to reveal an inherent nonstationarity in the underlying
process. Such is the case for the Wiener process discussed earlier. For $\gamma < 1$, the lack
of integrability in the tails of the spectrum reflects a preponderance of high-frequency
energy and is termed the ultraviolet (UV) catastrophe. Such behavior is familiar for
generalized Gaussian processes such as stationary white Gaussian noise and its usual
derivatives.
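
As a brief worked illustration of the two catastrophes (a sketch only; the cutoffs $\epsilon$ and $\Omega$ are introduced here for illustration and do not appear in the original), one can integrate the power law in (3.6) separately near the origin and in the tail:

```latex
% Low-frequency end, with a hypothetical lower cutoff 0 < \epsilon < 1:
\int_{\epsilon}^{1} \omega^{-\gamma}\, d\omega =
\begin{cases}
  \dfrac{1 - \epsilon^{1-\gamma}}{1-\gamma}, & \gamma \neq 1 \\[2ex]
  \ln(1/\epsilon), & \gamma = 1
\end{cases}
\qquad \text{unbounded as } \epsilon \to 0 \text{ exactly when } \gamma \ge 1 .

% High-frequency end, with a hypothetical upper cutoff \Omega > 1:
\int_{1}^{\Omega} \omega^{-\gamma}\, d\omega =
\begin{cases}
  \dfrac{\Omega^{1-\gamma} - 1}{1-\gamma}, & \gamma \neq 1 \\[2ex]
  \ln \Omega, & \gamma = 1
\end{cases}
\qquad \text{unbounded as } \Omega \to \infty \text{ exactly when } \gamma \le 1 .
```

In particular, the borderline case $\gamma = 1$ diverges logarithmically at both ends, which is consistent with (3.6) failing to be integrable for every value of $\gamma$.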

An important property of 1/f processes is their persistent statistical dependence.
Indeed, the generalized Fourier pair [35]

$$\frac{|\tau|^{\gamma-1}}{2\Gamma(\gamma)\cos(\gamma\pi/2)} \longleftrightarrow \frac{1}{|\omega|^\gamma}, \tag{3.8}$$

valid for $\gamma > 0$ but $\gamma \ne 1, 2, 3, \ldots$, suggests that the autocorrelation $R_x(\tau)$ associated
with the spectrum (3.6) for $0 < \gamma < 1$ is characterized by slow decay of the form

$$R_x(\tau) \sim |\tau|^{\gamma-1}.$$

This power law decay in correlation structure distinguishes 1/f processes from many
traditional models for time series analysis. Indeed, the well-studied family of
autoregressive moving-average (ARMA) models have a correlation structure invariably
characterized by exponential decay. As a consequence, ARMA models are generally
inadequate for capturing long-term dependence in data.
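
The contrast between power law and exponential decay can be made concrete with a small numerical sketch. The function names below are hypothetical, the power law is left unnormalized, and a first-order autoregressive correlation $\rho^{|k|}$ is used only as a stand-in for the exponential decay typical of ARMA models.

```python
def power_law_corr(lag, gamma):
    """Unnormalized power-law correlation |lag|**(gamma - 1), for nonzero lag.

    Assumes 0 < gamma < 1, the range in which the text applies the decay law.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("this sketch assumes 0 < gamma < 1")
    return abs(lag) ** (gamma - 1.0)


def ar1_corr(lag, rho):
    """Correlation rho**|lag| of a first-order autoregressive process, |rho| < 1."""
    return rho ** abs(lag)


if __name__ == "__main__":
    # Illustrative parameter choices (assumptions): gamma = 0.8, rho = 0.9.
    for lag in (1, 10, 100, 1000):
        print(lag, power_law_corr(lag, 0.8), ar1_corr(lag, 0.9))
```

With these illustrative values the power-law correlation at lag 1000 is still of order 0.1, while the exponential one is smaller by many orders of magnitude, which is the sense in which ARMA models miss long-term dependence.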

Perhaps the most important families of 1/f processes are those which are
non-Gaussian. Indeed, a number of rich and interesting examples of non-Gaussian
self-similar behavior can be constructed by exploiting the theory of stable distributions
[26] [36] [37] [38]. Nevertheless, Gaussian models are generally applicable in a broad
range of contexts, and are analytically highly tractable. For these reasons, we focus
principally on Gaussian 1/f processes in this thesis. In the next section, we review
what are perhaps the most popular mathematical models for Gaussian 1/f processes:
fractional Brownian motion and fractional Gaussian noise.

3.1.1  Fractional  Brownian  Motion  and  Fractional  Gaussian 
Noise 

It is generally agreed [38] [26] [30] that fractional Brownian motion (fBm) and
fractional Gaussian noise (fGn) models were first proposed by Kolmogorov, although their
current popularity is undoubtedly due to Mandelbrot, who independently derived the
theory with Van Ness in [26] and promoted their use in numerous subsequent
publications (see, e.g., the references in [30]). An extensive bibliographic guide to various
subsequent developments of the theory, principally in the mathematics literature, is
presented by Taqqu in [38].

In this framework, processes corresponding to $1 < \gamma < 3$, for which there is
infinite low-frequency power, are developed as nonstationary random processes having
finite power in any finite time interval. These processes are the fractional Brownian
motions, and classical Brownian motion is a special case corresponding to $\gamma = 2$.
By contrast, processes corresponding to $-1 < \gamma < 1$, for which there is infinite
high-frequency power, are developed as generalized stationary Gaussian processes
corresponding to the derivative of a fractional Brownian motion. These processes are
the fractional Gaussian noises, and stationary white Gaussian noise is a special case
corresponding to $\gamma = 0$. The theory does not accommodate the cases $\gamma > 3$ and
$\gamma < -1$. Furthermore, the models are degenerate for the cases $\gamma = -1$, $\gamma = 1$, and
$\gamma = 3$.
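
The parameter ranges in this paragraph can be summarized in a short lookup, sketched here under the assumption that the relation $\gamma = 2H + 1$ of (3.7) is applied within each range. The function name and the returned labels are hypothetical and purely illustrative.

```python
def classify_spectral_exponent(gamma):
    """Return (model label, self-similarity parameter) for a spectral exponent.

    Uses gamma = 2H + 1, so the parameter is (gamma - 1) / 2 in both ranges.
    """
    if gamma in (-1.0, 1.0, 3.0):
        # The framework is degenerate at the three boundary values.
        return ("degenerate", None)
    if -1.0 < gamma < 1.0:
        # Fractional Gaussian noise: parameter lies in (-1, 0).
        return ("fGn", (gamma - 1.0) / 2.0)
    if 1.0 < gamma < 3.0:
        # Fractional Brownian motion: parameter lies in (0, 1).
        return ("fBm", (gamma - 1.0) / 2.0)
    # gamma > 3 or gamma < -1 is not accommodated by the theory.
    return ("unsupported", None)
```

For instance, $\gamma = 2$ maps to fBm with parameter 1/2 (classical Brownian motion), and $\gamma = 0$ maps to fGn with parameter $-1/2$ (white Gaussian noise).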

A reasonable approach for developing 1/f models would appear to be to consider
driving stationary white Gaussian noise through a linear time-invariant system with
impulse response

$$\upsilon(t) = \frac{1}{\Gamma(H+1/2)}\, t^{H-1/2}\, u(t) \tag{3.9}$$

where $\Gamma(\cdot)$ is the gamma function and $u(t)$ denotes the unit step. Indeed, (3.9) has
the generalized Laplace transform [39]

$$\Upsilon(s) = \frac{1}{s^{H+1/2}},$$

which suggests that if the input has spectral density $\sigma_w^2$, the output will have a power
spectrum, in some sense, of the form (3.6) where $\gamma$ is given via (3.7). As we will
discuss in Chapter 7, (3.9) represents an example of a linear jointly time- and
scale-invariant system of degree $H + 1/2$. However, the system defined via (3.9) is unstable
except for the degenerate case $H = -1/2$. Consequently, the convolution

$$x(t) = \upsilon(t) * w(t) = \frac{1}{\Gamma(H+1/2)} \int_{-\infty}^{t} (t-\tau)^{H-1/2}\, w(\tau)\, d\tau \tag{3.10}$$

is not well-defined.
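
One way to see the difficulty, offered as a sketch rather than as the argument of the original, is to compute the formal variance of the convolution (3.10). The calculation assumes the impulse response is $t^{H-1/2}u(t)/\Gamma(H+1/2)$, the form whose Laplace transform is $1/s^{H+1/2}$, and that the input is white with spectral density $\sigma_w^2$:

```latex
E\left[x^2(t)\right]
  = \frac{\sigma_w^2}{\Gamma^2(H+1/2)} \int_{-\infty}^{t} (t-\tau)^{2H-1}\, d\tau
  = \frac{\sigma_w^2}{\Gamma^2(H+1/2)} \int_{0}^{\infty} u^{2H-1}\, du .
% The last integral diverges at its upper limit when H \ge 0
% and at its lower limit when H \le 0, so it is never finite.
```

Under these assumptions no choice of $H$ yields a finite-variance output, which is consistent with the need to modify the limits of integration.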

In developing their 1/f model, Barnes and Allan [40] addressed this dilemma by
keying the integration in (3.10) to the time origin, defining their self-similar process
by

$$x(t) = \frac{1}{\Gamma(H+1/2)} \int_0^t |t-\tau|^{H-1/2}\, w(\tau)\, d\tau \tag{3.11}$$

where this definition is extended for $t < 0$ through the convention (3.4). It is
interesting to remark that (3.11) is familiar in mathematics as the Riemann-Liouville integral
of $w(t)$ over the interval $0 < \tau < t$. In fractional calculus theory [41], it often is used
to define the fractional integral of $w(t)$ of order $\lambda = H + 1/2 > 0$, usually denoted

$$x(t) = \frac{d^{-\lambda}}{dt^{-\lambda}}\, w(t).$$

The resulting process is well-defined, satisfies $x(0) = 0$, and is statistically
self-similar with parameter $H$, i.e.,

$$R_x(t,s) = \frac{1}{\Gamma^2(H+1/2)} \int_0^{\min(t,s)} \left[ (\min(t,s)-\tau)(\max(t,s)-\tau) \right]^{H-1/2} d\tau = a^{-2H} R_x(at,as) \tag{3.12}$$

for $a > 0$. However, the Barnes-Allan process constitutes a rather poor model for 1/f
behavior. In fact, it lacks any kind of stationary quality. For instance, the increment
process

$$\Delta x(t;\varepsilon) = \frac{x(t+\varepsilon) - x(t)}{\varepsilon} \tag{3.13}$$

while statistically self-similar, satisfying

$$\Delta x(t;\varepsilon) \stackrel{\mathcal{P}}{=} a^{-(H-1)}\, \Delta x(at; a\varepsilon) \tag{3.14}$$

for every $\varepsilon > 0$, is nonstationary. Consequently, one cannot associate a stationary
generalized derivative with the process. In effect, the underlying problem is that the
Barnes-Allan process places too much emphasis on the time origin [26].

Fractional Brownian motion represents a very useful refinement of the Barnes-Allan
process. Specifically, fractional Brownian motion is a nonstationary Gaussian
self-similar process $x(t)$ also satisfying $x(0) = 0$, but defined in such a way that
its corresponding increment process $\Delta x(t;\varepsilon)$ is self-similar and stationary for every
$\varepsilon > 0$. Imposing these constraints on increments of fractional Brownian motion leads
to a comparatively better model for 1/f behavior.

A convenient though specialized definition of fractional Brownian motion is given
by [42]

$$x(t) = \frac{1}{\Gamma(H+1/2)} \left[ \int_{-\infty}^{0} \left( |t-\tau|^{H-1/2} - |\tau|^{H-1/2} \right) w(\tau)\, d\tau + \int_0^t |t-\tau|^{H-1/2}\, w(\tau)\, d\tau \right] \tag{3.15}$$

for $0 < H < 1$, where $w(t)$ is a zero-mean, stationary white Gaussian noise process
with unit spectral density. Again, for $t < 0$, $x(t)$ is defined through the convention
(3.4). Note that with $H = 1/2$, (3.15) specializes to the Wiener process (3.3), i.e.,
classical Brownian motion.

Fractional Brownian motions are, in fact, fractals. Specifically, sample functions
of fractional Brownian motions whose self-similarity parameters lie in the range
$0 < H < 1$ (i.e., $1 < \gamma < 3$) have a fractal (Hausdorff-Besicovitch) dimension given by
[30]

$$D = 2 - H$$

that gives a quantitative measure of their roughness.

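The correlation function for fractional Brownian motion can be readily derived as

$$R_x(t,s) = E[x(t)x(s)] = \frac{\sigma_H^2}{2} \left( |s|^{2H} + |t|^{2H} - |t-s|^{2H} \right) \tag{3.16}$$

where

$$\sigma_H^2 = \operatorname{Var} x(1) = \Gamma(1-2H)\, \frac{\cos(\pi H)}{\pi H}, \tag{3.17}$$

from which it is straightforward to verify that the process is statistically self-similar
with parameter $H$.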
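
The verification mentioned above can also be checked numerically. The following is a minimal sketch, assuming the correlation function takes the form $\frac{\sigma_H^2}{2}(|s|^{2H} + |t|^{2H} - |t-s|^{2H})$ with $\sigma_H^2 = \Gamma(1-2H)\cos(\pi H)/(\pi H)$; the function names are hypothetical, and the value at $H = 1/2$ is supplied as the limit of that expression, which equals 1.

```python
import math


def sigma_h_squared(H):
    """Var x(1) for fractional Brownian motion with 0 < H < 1."""
    if not 0.0 < H < 1.0:
        raise ValueError("this sketch assumes 0 < H < 1")
    if abs(H - 0.5) < 1e-9:
        # Gamma(1 - 2H) diverges while cos(pi H) vanishes; the limit is 1.
        return 1.0
    return math.gamma(1.0 - 2.0 * H) * math.cos(math.pi * H) / (math.pi * H)


def fbm_covariance(t, s, H):
    """E[x(t) x(s)] for fractional Brownian motion."""
    two_h = 2.0 * H
    return 0.5 * sigma_h_squared(H) * (
        abs(s) ** two_h + abs(t) ** two_h - abs(t - s) ** two_h
    )


def increment_variance(t, eps, H):
    """Var[x(t + eps) - x(t)], expanded in terms of the covariance."""
    return (
        fbm_covariance(t + eps, t + eps, H)
        - 2.0 * fbm_covariance(t + eps, t, H)
        + fbm_covariance(t, t, H)
    )
```

Scaling both arguments by $a$ multiplies the covariance by $a^{2H}$, and the increment variance reduces to $\sigma_H^2 \varepsilon^{2H}$ regardless of $t$, which is the stationarity of increments discussed in the text. With $H = 1/2$ the covariance reduces to $\min(t,s)$ for positive arguments, as expected for the Wiener process.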

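It is likewise straightforward to verify that the normalized increments of fractional
Brownian motion are stationary and self-similar, and have the autocorrelation

$$R_{\Delta x}(\tau;\varepsilon) \triangleq E[\Delta x(t;\varepsilon)\, \Delta x(t-\tau;\varepsilon)] = \frac{\sigma_H^2\, \varepsilon^{2H-2}}{2} \left[ \left( \frac{|\tau|}{\varepsilon} + 1 \right)^{2H} - 2 \left( \frac{|\tau|}{\varepsilon} \right)^{2H} + \left| \frac{|\tau|}{\varepsilon} - 1 \right|^{2H} \right]. \tag{3.18}$$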


At large lags ($|\tau| \gg \varepsilon$), the correlation is asymptotically given by

$$R_{\Delta x}(\tau) \approx \sigma_H^2\, H(2H-1)\, |\tau|^{2H-2}. \tag{3.19}$$
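
The large-lag approximation can be compared against the exact increment autocorrelation numerically. This is a sketch under stated assumptions: the exact expression is taken to be the second difference of $|\tau|^{2H}$ that follows from the fBm correlation function, the asymptote is taken as $\sigma_H^2 H(2H-1)|\tau|^{2H-2}$, and all function names are hypothetical.

```python
import math


def sigma_h_squared(H):
    """Var x(1) for fractional Brownian motion, with the H = 1/2 limit set to 1."""
    if abs(H - 0.5) < 1e-9:
        return 1.0
    return math.gamma(1.0 - 2.0 * H) * math.cos(math.pi * H) / (math.pi * H)


def increment_autocorr(tau, eps, H):
    """Exact autocorrelation of the normalized increments at lag tau."""
    r = abs(tau) / eps
    two_h = 2.0 * H
    second_difference = (r + 1.0) ** two_h - 2.0 * r ** two_h + abs(r - 1.0) ** two_h
    return 0.5 * sigma_h_squared(H) * eps ** (two_h - 2.0) * second_difference


def increment_autocorr_asymptote(tau, H):
    """Large-lag approximation, valid when |tau| is much larger than eps."""
    return sigma_h_squared(H) * H * (2.0 * H - 1.0) * abs(tau) ** (2.0 * H - 2.0)
```

At a lag of 1000 increments the two expressions agree closely, and the sign of the correlation follows the sign of $H - 1/2$: positive for $H > 1/2$, zero for $H = 1/2$, negative for $H < 1/2$.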


Letting $\varepsilon \to 0$, and defining

$$H' = H - 1, \tag{3.20}$$

we can reason that fractional Brownian motion has the generalized derivative [42]

$$x'(t) = \frac{d}{dt} x(t) = \lim_{\varepsilon \to 0} \Delta x(t;\varepsilon) = \frac{1}{\Gamma(H'+1/2)} \int_{-\infty}^{t} |t-\tau|^{H'-1/2}\, w(\tau)\, d\tau \tag{3.21}$$

which is termed fractional Gaussian noise. Note that (3.21) is precisely a convolution
of the form (3.10), for which we now have an interpretation. Furthermore, from this
observation we deduce that the derivative process $x'(t)$ is stationary and statistically
self-similar with parameter $H'$.

From (3.19) it is apparent that the character of $x'(t)$ depends strongly on the
value of $H$. Note that the right side of (3.19) has the same algebraic sign as $H - 1/2$.
Hence, for $1/2 < H < 1$ this derivative process exhibits long-term dependence, i.e.,
persistent correlation structure. For $H = 1/2$, this derivative is the usual stationary
white Gaussian noise, which has no correlation, while for $0 < H < 1/2$, the
derivative exhibits persistent anti-correlation. For $1/2 < H < 1$, $x'(t)$ is zero-mean and
stationary with covariance

$$R_{x'}(\tau) = E[x'(t)x'(t-\tau)] = \sigma_H^2\, (H'+1)(2H'+1)\, |\tau|^{2H'} \tag{3.22}$$

and we note that the generalized Fourier pair (3.8) suggests that the corresponding
power spectral density of the derivative process can be expressed, for $\omega \ne 0$, as

$$S_{x'}(\omega) = \frac{1}{|\omega|^{\gamma'}} \tag{3.23}$$

where

$$\gamma' = 2H' + 1.$$

A conceptually useful synthesis for fractional Brownian motion is depicted in
Fig. 3-2. In particular, driving a linear time-invariant system with impulse response

$$\upsilon(t) = \frac{1}{\Gamma(H-1/2)}\, t^{H-3/2}\, u(t)$$

with stationary white Gaussian noise $w(t)$ generates a fractional Gaussian noise $x'(t)$,
from which fractional Brownian motion $x(t)$ is obtained by routine integration:

$$x(t) = \int_0^t x'(\tau)\, d\tau.$$

Figure 3-2: Synthesis of fractional Brownian motion $x(t)$ in terms of fractional
Gaussian noise $x'(t)$ and stationary white Gaussian noise $w(t)$.


The fractional Brownian motion framework provides a useful construction for
some models of 1/f-type spectral behavior corresponding to spectral exponents in
the range $-1 < \gamma < 1$ and $1 < \gamma < 3$. In fact, they uniquely model certain classes
of 1/f behavior. One can show, for instance, that for $0 < H < 1$, fractional Brownian
motion constitutes the only statistically self-similar, zero-mean, mean-square
continuous, finite-variance, Gaussian random process satisfying $x(0) = 0$ and having
stationary increments. While these are somewhat restrictive conditions, this
framework has, in general, become a popular one for modeling a variety of phenomena with
1/f behavior; see, e.g., [30] [32] [43] [42]. However, fractional Brownian motion and
fractional Gaussian noise are not the only models for 1/f behavior, even within the
respective parameter ranges $-1 < \gamma < 1$ and $1 < \gamma < 3$. In fact, in some cases they
constitute rather poor models for 1/f behavior.

One unsatisfying characteristic of fractional Brownian motion is its pronounced
time origin. Indeed, fractional Brownian motion satisfies not only $x(0) = 0$, but also
power law growth in variance as a function of time, i.e.,

$$\operatorname{Var} x(t) = \sigma_H^2\, |t|^{2H}.$$

In modeling many physical phenomena having empirical spectra corresponding to $\gamma$
in this range, the notion of such a time origin is not only rarely observed, but rather
unnatural as well.

Additionally, the fractional Brownian motion framework has a number of
limitations. In particular, it does not lead to useful models for 1/f processes corresponding
to $\gamma < -1$, $\gamma > 3$, and perhaps the most important and ubiquitous case, $\gamma = 1$.
Indeed, for $\gamma = 3$ ($H = 1$), fractional Brownian motion as defined by (3.15) degenerates
to a process whose sample paths are all lines through the origin, viz.,

$$x(t) = |t|\, x(1).$$

For $\gamma = 1$ ($H = 0$), fractional Brownian motion degenerates to the trivial process

$$x(t) = 0.$$

More generally, choosing $H < 0$ in (3.15) leads to processes that are not mean-square
continuous, while choosing $H > 1$ in (3.15) leads to processes whose increments are
not stationary [26] [37].

In the next section, we consider a more general but non-constructive model for
1/f processes that includes both fractional Brownian motions and fractional
Gaussian noises, yet appears to avoid some of the restrictions imposed by the fractional
Brownian motion framework.

3.1.2  A New Mathematical Characterization of 1/f Processes

The notion that measurements of spectra for physical processes can only be obtained
over a range of frequencies governed by data length and resolution limitations suggests
a potentially useful approach for defining 1/f processes. In particular, it would appear
reasonable to define 1/f processes in terms of their characteristics under bandpass
filtering. As a consequence, we propose the following definition.

Definition 3.1 A wide-sense statistically self-similar zero-mean random process $x(t)$
shall be said to be a 1/f process if when $x(t)$ is filtered by an ideal bandpass filter with
frequency response

$$B_1(\omega) = \begin{cases} 1 & \pi < |\omega| < 2\pi \\ 0 & \text{otherwise} \end{cases} \tag{3.24}$$

the resulting process $y_1(t)$ is wide-sense stationary and has finite variance.


In fact, the choice of passband $\pi < |\omega| < 2\pi$ in this definition is rather arbitrary.
More generally, replacing this passband with any other passband that contains neither
$\omega = 0$ nor $\omega = \infty$ constitutes an equivalent characterization for this class of
self-similar processes. Furthermore, one might speculate that choosing an ideal bandpass
filter in this definition is not critical. In particular, it might suffice to choose any
filter whose frequency response $B(\omega)$ has sufficient decay as $\omega \to 0$ and $\omega \to \infty$.
Nevertheless, our choice is at least a convenient one, as we shall see. The appeal of
Definition 3.1 as a characterization for 1/f processes is its basis in the frequency
domain. As a consequence, this allows us to extend the well-established tools of
Fourier analysis to this important class of nonstationary processes. In turn, this
allows us to derive a number of new properties of 1/f processes in a highly efficient
manner.

The following theorem justifies designating processes satisfying Definition 3.1 as
1/f processes, and leads to an important interpretation of the spectrum (3.6) for 1/f
processes. A detailed but straightforward proof is provided in Appendix B.1.


Theorem 3.2 A 1/f process $x(t)$, when filtered by an ideal bandpass filter with
frequency response

$$B(\omega) = \begin{cases} 1 & \omega_L < |\omega| < \omega_U \\ 0 & \text{otherwise} \end{cases} \tag{3.25}$$

for arbitrary $0 < \omega_L < \omega_U < \infty$, yields a wide-sense stationary random process $y(t)$
with finite variance and having power spectrum

$$S_y(\omega) = \begin{cases} \sigma_x^2 / |\omega|^\gamma & \omega_L < |\omega| < \omega_U \\ 0 & \text{otherwise} \end{cases} \tag{3.26}$$

for some $\sigma_x^2 > 0$, and where the spectral exponent $\gamma$ is related to the self-similarity
parameter $H$ according to (3.7).
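
To illustrate why the bandpass output has finite variance, the spectrum of Theorem 3.2 can be integrated in closed form. This is a sketch that assumes the convention in which variance equals $\frac{1}{2\pi}\int S_y(\omega)\,d\omega$ over both sidebands; the function name and the default $\sigma_x^2 = 1$ are illustrative assumptions.

```python
import math


def bandpass_variance(gamma, w_lo, w_hi, sigma2=1.0):
    """Variance of a 1/f process after ideal bandpass filtering to (w_lo, w_hi).

    Computes (sigma2 / pi) times the integral of w**(-gamma) over the passband,
    where the factor 1/pi combines 1/(2 pi) with the two symmetric sidebands.
    """
    if not 0.0 < w_lo < w_hi < math.inf:
        raise ValueError("passband must satisfy 0 < w_lo < w_hi < infinity")
    if abs(gamma - 1.0) < 1e-12:
        integral = math.log(w_hi / w_lo)
    else:
        p = 1.0 - gamma
        integral = (w_hi ** p - w_lo ** p) / p
    return sigma2 * integral / math.pi
```

For any passband that excludes both the origin and infinity the result is finite. For $\gamma = 1$ every octave carries the same power, and in general shifting an octave band up by a factor $2^m$ scales its power by $2^{m(1-\gamma)}$.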

An important question that must be addressed concerns whether there exist any
non-trivial random processes satisfying Definition 3.1. The following theorem
constitutes such an existence proof by verifying that Definition 3.1 is non-degenerate for
at least some values of $\gamma$. In particular, it establishes that it is possible to construct
families of Gaussian processes that satisfy this definition. A straightforward proof is
provided in Appendix B.2.

Theorem 3.3 Fractional Brownian motions corresponding to $0 < H < 1$ and the
associated fractional Gaussian noises are 1/f processes.

We remark that, based on our discussion of Section 3.1.1, an immediate corollary is
that the Wiener process and stationary white Gaussian noise are also 1/f processes.
However, the Barnes-Allan process we described at the outset of Section 3.1.1 is
not a 1/f process in the sense of Definition 3.1. This is to be expected, given the
shortcomings of the Barnes-Allan process in modeling 1/f-type behavior.

The other question that naturally arises concerns whether there are any other
Gaussian 1/f processes besides those of Theorem 3.3. For instance, is it possible to
construct non-trivial Gaussian processes that satisfy Definition 3.1 for values of $H$
outside $0 < H < 1$? And, are there other Gaussian processes satisfying this definition
for $0 < H < 1$? The very recent work of Ramanathan and Zeitouni [44] indirectly
suggests that the answer may perhaps be negative, although they do not prove such
a result. In effect, these authors show that if we were to replace the bandpass filter
(3.24) in Definition 3.1 with a roughly bandpass filter whose frequency response is
differentiable, has a simple zero at $\omega = 0$, and decays sufficiently quickly as $\omega \to \infty$,
then the definition uniquely characterizes fractional Brownian motion. However, the
constraints on the filter they consider are overly restrictive to answer our questions.
Furthermore, it may be that the technical definition of a random process they consider
is too narrowly chosen to accommodate 1/f behavior.

In any event, a practical difficulty with both the fractional Brownian motion
framework and our new frequency-based characterization for 1/f processes is that while
mathematically well-defined, neither is analytically convenient in many contexts. In
fact, there are many very basic signal processing problems that are effectively
intractable using these models.

To address this limitation, we next consider some more general classes of 1/f-like
models. While these models typically do not give rise to exactly-1/f spectra, they
give rise to spectra that are nearly-1/f. As we shall see, these processes retain most of
the fundamental characteristics of 1/f processes, yet are considerably more amenable
to analysis.


3.2  Nearly-1/f Processes

Perhaps the best-known class of nearly-1/f processes has been those based upon
a generalized, infinite-order autoregressive moving-average (ARMA) framework. We
shall review two such formulations before developing a wavelet-based model for
1/f-like behavior that constitutes the principal contribution of the chapter.

3.2.1  ARMA  Models 

There have been a variety of attempts to exploit a generalized autoregressive
moving-average framework in modeling 1/f processes. Perhaps the earliest such framework,
based on a “distribution of time constants” formulation, arose in the physics literature
and dates back at least to the work of Bernamont [45]. However, it was really the
seminal paper of van der Ziel [46] that sparked substantial interest in this approach,
and much subsequent development.

Van der Ziel’s basic approach was to model a 1/f process as the weighted
superposition of an infinite number of uncorrelated random processes, each governed by a
distinct characteristic time-constant $1/\alpha > 0$. Each of these random processes has
correlation function

$$R_\alpha(\tau) = e^{-\alpha|\tau|}$$
for $0 < \gamma < 2$ is chosen, the resulting spectrum (3.27) is 1/f, i.e.,

This mathematical identity suggests a useful and practical approach to modeling
1/f-like behavior using the superposition of a countable collection of single
time-constant processes whose poles are appropriately distributed. In fact, the density
(3.28) implies that the poles should be uniformly distributed along a logarithmic
frequency axis. The resulting process $x(t)$ synthesized in this manner then has a
nearly-1/f spectrum in the following sense: when $x(t)$ is filtered by any bandpass
filter of the form (3.25) the result is a stationary process whose spectrum within
the passband is 1/f with superimposed ripple that is uniformly spaced and of uniform
amplitude on a log-log frequency plot.
As $\Delta$ is chosen closer to unity, the pole spacing decreases, which results in a decrease
in both the amplitude and spacing of the spectral ripple on a log-log plot.
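
The effect of logarithmically spaced poles can be explored with a short numerical sketch. It rests on assumptions that are not taken from the text: poles are placed at $\alpha_m = \Delta^m$, each contributes a Lorentzian term $1/(\omega^2 + \alpha_m^2)$, and the weights $\alpha_m^{2-\gamma}$ are what a density proportional to $\alpha^{-\gamma}$ would give on a logarithmic grid. The function names and the truncation `m_max` are hypothetical.

```python
import math


def lorentzian_sum_spectrum(omega, gamma, delta=2.0, m_max=60):
    """Truncated sum of Lorentzians with poles at delta**m, m = -m_max..m_max."""
    total = 0.0
    for m in range(-m_max, m_max + 1):
        pole = delta ** m
        # Assumed weight pole**(2 - gamma): density alpha**(-gamma) on a log grid.
        total += pole ** (2.0 - gamma) / (omega * omega + pole * pole)
    return total


def local_slope(omega, gamma, delta=2.0, m_max=60):
    """Log-log slope of the spectrum measured across one pole spacing."""
    s_low = lorentzian_sum_spectrum(omega, gamma, delta, m_max)
    s_high = lorentzian_sum_spectrum(delta * omega, gamma, delta, m_max)
    return math.log(s_high / s_low) / math.log(delta)
```

Measured across exactly one pole spacing, the slope equals $-\gamma$ up to truncation error, because the ripple is periodic in $\log \omega$ with period $\log \Delta$. Evaluating the spectrum at intermediate frequencies exposes the ripple itself, which shrinks as `delta` approaches unity.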

Note that we may interpret the 1/f model that results from this discretization
as an infinite-order ARMA process. That is, $x(t)$ can be viewed as the output of a
rational LTI system with a countably infinite number of both poles and zeros driven
by a stationary white noise source. There has been a substantial body of literature
that has attempted to exploit this distribution-of-time-constants model in an effort
to explain the ubiquity of 1/f spectra in nature. In essence, [47] [48] [49] construct
mathematical arguments to the effect that 1/f spectra are the result of large, complex
systems in nature favoring scale-invariant time-constant distributions of the form
(3.28).

Somewhat more recently, Keshner [28] [31] developed an alternative ARMA-based
model for 1/f-like behavior from an engineering perspective. This approach, which
has also received considerable attention in the literature, is based on the observation
that an infinite-length continuous RC transmission line when driven with a stationary
white noise current $i(t)$ yields a measured voltage $v(t)$ whose power spectrum is of
the form (3.6) for $0 < \gamma < 2$. That is, in some sense, the impedance function of the
line is of the form

$$\frac{V(s)}{I(s)} \propto \frac{1}{s^{\gamma/2}}.$$

By considering an infinite-length, lumped-parameter RC line as an approximation,
Keshner showed that this gave rise to nearly-1/f behavior in much the same manner
as was obtained in the van der Ziel model. It is possible to interpret 1/f processes
obtained in this manner as the result of driving stationary white noise through an
LTI system with a rational system function

$$\Upsilon(s) = \prod_{m=-\infty}^{\infty} \frac{s + \Delta^{m+\gamma/2}}{s + \Delta^{m}} \tag{3.33}$$


which, in turn, leads to a spectrum of the form

$$S_x(\omega) \propto \prod_{m=-\infty}^{\infty} \frac{\omega^2 + \Delta^{2m+\gamma}}{\omega^2 + \Delta^{2m}}. \tag{3.34}$$


This nearly-1/f spectrum has the same properties as the van der Ziel spectrum,
satisfying both (3.31) and (3.32). In fact, comparing the spectra (3.34) and (3.30),
we see that the pole placement strategy for both is identical. However, the zeros in
the two models are, in general, distributed much differently. This, in turn, leads to
differing degrees of ripple amplitude for a given pole spacing for the two different
models [50].

It is interesting to remark that the idea of using an infinite lumped RC line to
realize such systems was independently developed in the context of fractional calculus.
Indeed, Oldham and Spanier [41] describe precisely such an implementation for
fractional integration operators of the type used in the construction of the Barnes-Allan
process (3.11).

The structure of the system function (3.33) of Keshner’s synthesis filter provides
additional insights into 1/f-like behavior. For example, the infinity of poles
suggests that a state space characterization of 1/f processes would generally require
infinitely many state variables, consistent with the notion of long term correlation
structure in such processes.

Also, this system function lends useful insight into the limiting behavior of 1/f
processes as $\gamma \to 0$ and $\gamma \to 2$. Note that the poles and zeros of (3.33) lie along
the negative real axis in the s-plane. On a logarithmic scale, the poles and zeros are
each spaced uniformly along this half line. Furthermore, in general, to the left of each
pole in the s-plane lies a matching zero, so that poles and zeros are alternating along
the half line. However, for certain values of $\gamma$, pole-zero cancellation takes place.
In particular, as $\gamma \to 2$, the zero pattern shifts left canceling all poles except the
limiting pole at $s = 0$. The resulting system is therefore an integrator, characterized
by a single state variable, and generates a Wiener process as anticipated. By contrast,
as $\gamma \to 0$, the zero pattern shifts right canceling all poles. The resulting system is
therefore a multiple of the identity system, requires no state variables, and generates
stationary white noise as anticipated.

Finally, note that the model may be interpreted in terms of a Bode plot. In
general, stable, rational system functions comprised of real poles and zeros are only
capable of generating transfer functions whose Bode plots have slopes that are integer
multiples of

$$ 20 \log_{10} 2 \approx 6 $$

dB per octave. However, a 1/f synthesis filter must fall off at

$$ 10\gamma \log_{10} 2 \approx 3\gamma $$

dB per octave, where 0 < γ < 2 is generally not an integer. To accommodate such
slopes using rational system functions requires an alternating sequence of poles and
zeros to generate a stepped approximation to a -3γ dB/octave slope from segments
that alternate between slopes of -6 dB/octave and 0 dB/octave.
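
The stepped approximation can be checked numerically. The following Python sketch is
illustrative only. It assumes real poles at -Δ^k and zeros at -Δ^(k + γ/2), which is one
log-uniform alternating arrangement of the kind described above and not necessarily the
exact parameterization of (3.33). The function name, the ratio Δ = 2, and the truncation
range are choices made here for the example.

```python
import numpy as np

def stepped_response_db(omega, gamma, ratio=2.0, k_range=range(-40, 41)):
    """Magnitude in dB of a rational filter with alternating real poles and zeros.

    Assumed layout (hypothetical): pole at -ratio**k, zero at -ratio**(k + gamma/2).
    """
    s = 1j * np.asarray(omega, dtype=float)
    h = np.ones_like(s)
    for k in k_range:
        pole = ratio ** k
        zero = ratio ** (k + gamma / 2.0)
        h *= (s + zero) / (s + pole)
    return 20.0 * np.log10(np.abs(h))

gamma = 1.0
db = stepped_response_db(np.array([1.0, 2.0 ** 8]), gamma)
avg_slope = (db[1] - db[0]) / 8.0            # dB per octave, averaged over 8 octaves
print(avg_slope, -10.0 * gamma * np.log10(2.0))
```

Measuring the slope over a whole number of octaves averages out the ripple, so the two
printed values should agree closely even though the local slope only ever takes values
between -6 and 0 dB/octave.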

Unfortunately, neither of the ARMA-based models has been particularly useful
in addressing basic problems of detection and estimation involving 1/f processes.
However, both have been used extensively as 1/f noise simulators. A discrete-time
implementation of the van der Ziel model is described in [51], while details of a
discrete-time implementation of Keshner's model due to Corsini and Saletti appear
in [52]. A comparison of the two approaches is presented in [50]. In virtually all the
simulations described in this thesis, the Corsini-Saletti implementation of Keshner's
model will be used to synthesize 1/f processes. In particular, the 1/f sample paths
of Fig. 3-1 are obtained via this algorithm.

3.2.2  Wavelet-Based  Models 

In this section, we explore the relationship between orthonormal wavelet bases and
nearly-1/f models. In particular, we shall show that wavelet basis expansions are both
natural and convenient representations for processes exhibiting 1/f-like behavior. Our
main result is that orthonormal wavelet basis expansions play the role of
Karhunen-Loève-like expansions for 1/f-type processes. That is, wavelet basis expansions in
terms of uncorrelated random variables constitute good models for 1/f-type behavior.

Synthesis 

In this section, we demonstrate that nearly-1/f behavior may be generated from
orthonormal wavelet basis expansions in terms of a collection of uncorrelated wavelet
coefficients. In particular, we establish the following theorem, an earlier version of
which appears in [53], and whose proof is provided in Appendix B.3.

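Theorem 3.4 Consider any orthonormal wavelet basis with Rth-order regularity for
some R ≥ 1. Then the random process constructed via the expansion

$$ x(t) = \sum_m \sum_n x_n^m \, \psi_n^m(t) \qquad (3.35) $$

where the $x_n^m$ are a collection of mutually uncorrelated, zero-mean random variables
with variances

$$ \operatorname{Var} x_n^m = \sigma^2 \, 2^{-\gamma m} $$

for some parameter 0 < γ < 2R, has a time-averaged spectrum

$$ S_x(\omega) = \sigma^2 \sum_m 2^{-\gamma m} \, |\Psi(2^{-m}\omega)|^2 \qquad (3.36) $$

that is nearly-1/f, i.e.,

$$ \frac{\sigma_L^2}{|\omega|^\gamma} \le S_x(\omega) \le \frac{\sigma_U^2}{|\omega|^\gamma} \qquad (3.37) $$

for some $0 < \sigma_L^2 \le \sigma_U^2 < \infty$, and has octave-spaced ripple, i.e., for any integer k

$$ |\omega|^\gamma S_x(\omega) = |2^k\omega|^\gamma S_x(2^k\omega). \qquad (3.38) $$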

In Fig. 3-3 we illustrate the time-averaged spectrum of a process constructed in
the manner of this theorem for γ = 1 using the first-order Battle-Lemarié wavelet
basis. Note the characteristic octave-spaced ripple. The bounding constants in this
case correspond to $\sigma_U^2/\sigma_L^2 = 1.103$.
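
As a concrete illustration of the construction in the theorem, the following Python
sketch synthesizes a finite-length sample path from uncorrelated Gaussian coefficients
whose variances follow a 2^(-γm) progression. It is a minimal sketch under stated
assumptions rather than the procedure used for the figures: it uses the Haar basis
(R = 1, so only 0 < γ < 2 is covered), a finite number of scales, a zero
coarsest-approximation coefficient, and function names invented here.

```python
import numpy as np

def haar_synthesize(details):
    """Inverse orthonormal Haar transform from detail coefficients only.

    details[m] holds the 2**m coefficients at scale m (m = 0 is the coarsest);
    the coarsest approximation coefficient is assumed to be zero.
    """
    approx = np.zeros(1)
    for d in details:
        up = np.empty(2 * approx.size)
        up[0::2] = (approx + d) / np.sqrt(2.0)
        up[1::2] = (approx - d) / np.sqrt(2.0)
        approx = up
    return approx

def haar_analyze(x):
    """Forward orthonormal Haar transform; returns details coarsest-first."""
    approx = np.asarray(x, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return details[::-1]

def synthesize_nearly_1f(num_scales, gamma, sigma=1.0, seed=0):
    """Draw uncorrelated coefficients with variance sigma**2 * 2**(-gamma*m)."""
    rng = np.random.default_rng(seed)
    details = [sigma * 2.0 ** (-gamma * m / 2.0) * rng.standard_normal(2 ** m)
               for m in range(num_scales)]
    return haar_synthesize(details), details

x, coeffs = synthesize_nearly_1f(num_scales=12, gamma=1.0)
print(x.size)   # number of samples in the synthesized path
```

Because the transform is orthonormal, `haar_analyze(x)` returns the same coefficients
that were drawn, which gives a simple consistency check on the synthesis.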

The result established by this theorem is certainly an intuitively reasonable one
if, for example, we view the orthonormal wavelet decomposition as a generalized
octave-band filter bank as described in Section 2.2.1. In fact, for the case of the ideal
bandpass wavelet basis, it can be readily established from simple geometric arguments
that the tightest bounding constants are

$$ \sigma_L^2 = \sigma^2 \pi^\gamma $$

$$ \sigma_U^2 = \sigma^2 (2\pi)^\gamma. $$
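
These constants can be checked numerically. The sketch below assumes the ideal bandpass
wavelet has |Ψ(ω)| = 1 for π < |ω| ≤ 2π and zero elsewhere, so that a single scale
contributes to the time-averaged spectrum at each nonzero frequency. The function name
and the frequency grid are illustrative choices.

```python
import numpy as np

def bandpass_spectrum(omega, gamma, sigma2=1.0):
    """Time-averaged spectrum for the ideal bandpass wavelet (omega must be nonzero).

    Assumes |Psi(w)| = 1 for pi < |w| <= 2*pi and 0 elsewhere, so exactly one
    scale m contributes at each frequency.
    """
    w = np.abs(np.asarray(omega, dtype=float))
    m = np.ceil(np.log2(w / np.pi)) - 1.0      # scale whose passband contains w
    return sigma2 * 2.0 ** (-gamma * m)

gamma = 1.0
w = np.linspace(1.0, 100.0, 20001)
flattened = w ** gamma * bandpass_spectrum(w, gamma)   # |w|^gamma * S_x(w)
print(flattened.min() / np.pi ** gamma, flattened.max() / (2.0 * np.pi) ** gamma)
```

Both printed ratios should be close to one, and the flattened spectrum repeats from
octave to octave, which is the ripple property stated in the theorem.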

Note, too, the special interpretation that may be derived from the model for the
case γ = 1, arguably the most prevalent of the 1/f-type processes. Here the choice
of the variance progression

$$ \operatorname{Var} x_n^m = \sigma^2 \, 2^{-m} $$

corresponds to distributing power equally among the detail signals at all resolution
scales, since we have for each m

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} 2^{-m} \, |\Psi(2^{-m}\omega)|^2 \, d\omega = 1. \qquad (3.39) $$

Figure 3-3: The time-averaged spectrum of a process synthesized from the first-order
Battle-Lemarié orthonormal wavelet basis. The parameters of the nearly-1/f
spectrum are γ = 1 and $\sigma_U^2/\sigma_L^2 = 1.103$ in this case. (Vertical axis: spectral
density S(ω).)

There are two aspects of this theorem that warrant further discussion. First,
the nearly-1/f spectrum (3.36) is to be interpreted in the same manner that the
1/f spectrum (3.6) is for exactly-1/f processes. That is, if x(t) is filtered by an
ideal bandpass filter with frequency response of the form (3.25), the output of the
filter will have finite power and correspond to a spectrum of the form (3.36) over the
passband $\omega_L < |\omega| < \omega_U$. However, it is important to emphasize that this spectrum
is a time-averaged one. Indeed, the output of such a bandpass filter will not, in
general, be stationary in any sense, which is a consequence of the discrete nature of
the synthesis. This behavior is in contrast to the ARMA-based nearly-1/f processes
discussed in Section 3.2.1, which, when bandpass filtered, yield stationary processes
with nearly-1/f spectra.

One approach to extending this model so as to incorporate this property of stationarity
is to add phase jitter in the synthesis process. Specifically, we may consider
randomizing the time-origin of our processes generated via (3.35) by applying a random
(positive or negative) delay to the process. In fact, this is one way of interpreting
(3.36) as the generalized spectrum of a stationary process. However, the random process
$\tilde{x}(t)$ constructed in this way is not ergodic. Furthermore, if the coefficients $x_n^m$
in Theorem 3.4 are chosen to be Gaussian, x(t) will necessarily be a Gaussian process,
but $\tilde{x}(t)$ will not. For these reasons, the phase-jittered process, while perhaps
useful for synthesizing 1/f-like behavior, is difficult to exploit in analyzing 1/f-like
behavior.

Some remarks concerning the conditions on the wavelet basis are also appropriate.
We begin by noting that to generate 1/f-like behavior for 0 < γ < 2, it suffices to
use a wavelet basis for which the corresponding multiresolution analysis is at least
regular. Again, virtually any practical wavelet basis satisfies this condition, even
the Haar basis. However, the theorem implies that to generate 1/f-like behavior for
γ > 2, higher regularity (R > 1) is required. This can be verified experimentally as
well. We find, for instance, that when we attempt to synthesize 1/f-like behavior for
γ = 5 using bases with R ≥ 3, the sample functions are characterized by a smoothness
consistent with the decay in their spectra. However, when bases corresponding to R < 3
are used in the synthesis, the sample functions lose their characteristic smoothness.
Specifically, using a Haar-based synthesis (R = 1), the sample functions exhibit
abrupt discontinuities, while using a second-order (R = 2) Daubechies basis leads
to sample functions exhibiting abrupt discontinuities in their derivatives. In effect,
unless there is sufficient regularity, the characteristics of the basis functions manifest
themselves in the sample functions generated by the expansion. However, at least
in this context, there would appear to be no benefit to using bases that have more
regularity than required by the theorem.

We also remark that a much stronger theorem holds for the case γ = 0 in which
the coefficients are not only uncorrelated but have identical variances. In this
case, constructing an expansion from such a collection of random variables in any
orthonormal basis yields stationary white noise whose spectral density is the
variance of the coefficients. In particular, for any wavelet basis we have

$$ S_x(\omega) = \sigma^2 = \sigma^2 \sum_m |\Psi(2^{-m}\omega)|^2 $$

when γ = 0, where the last equality is a restatement of the identity (2.9) and
demonstrates the consistency of this case with (3.36).

Finally, we remark that Theorem 3.4 may, in principle, be extended to γ < 0
provided the wavelet basis used in the synthesis has a sufficient number of vanishing
moments. This can be deduced from the proof in Appendix B.3. However, this
extension to the theorem was not incorporated principally because there would appear
to be relatively few, if any, physical processes corresponding to negative γ.

Analysis 

In this section, we derive a collection of complementary results to suggest that wavelet
bases are equally useful in the analysis of 1/f processes. In particular, we provide
both theoretical and empirical evidence suggesting that when 1/f-like processes are
expanded in terms of orthonormal wavelet bases, the resulting wavelet coefficients
are typically rather weakly correlated, particularly in contrast to the rather strong
correlation present in the original process. These results, combined with those of the
last section, provide evidence that such wavelet-based representations are robust
characterizations of 1/f-like behavior.

Virtually  all  the  results  we  obtain  in  this  section  are  derived  conveniently  and 
efficiently  in  the  frequency-domain.  In  anticipation  of  these  derivations,  we  first 
establish  the  following  proposition,  whose  proof  is  outlined  in  Appendix  B.4. 

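Proposition 3.5 Let x(t) be a 1/f process whose spectral parameters, in the sense
of Theorem 3.2, are $\sigma_x^2$ and γ > 0. Furthermore, let the wavelet coefficients $x_n^m$ be
the projections of x(t) onto some orthonormal wavelet basis. Then the correlation
between an arbitrary pair of such coefficients $x_n^m$ and $x_{n'}^{m'}$ is given by

$$ E\left[x_n^m x_{n'}^{m'}\right] = \frac{2^{-(m+m')/2}}{2\pi} \int_{-\infty}^{\infty} \frac{\sigma_x^2}{|\omega|^\gamma} \, \Psi(2^{-m}\omega) \, \Psi^*(2^{-m'}\omega) \, e^{-j(2^{-m}n - 2^{-m'}n')\omega} \, d\omega \qquad (3.40) $$

for any choice of ψ(t) and γ such that this integral is convergent.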

The principal shortcoming of this proposition is that it fails to establish conditions
on the wavelet basis and γ under which (3.40) is defined. Nevertheless, we
may generally use Proposition 3.5 to derive properties of the second-order statistics
of wavelet coefficients of 1/f processes for γ > 0. For instance, an immediate
consequence of the proposition is that we can show the variance of the $x_n^m$ to be of the
form

$$ \operatorname{Var} x_n^m = \sigma^2 \, 2^{-\gamma m} $$

where

$$ \sigma^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\sigma_x^2}{|\omega|^\gamma} \, |\Psi(\omega)|^2 \, d\omega. $$

To obtain this result, it suffices to let m' = m and n' = n in (3.40) and effect a change
of variables.
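Defining

$$ \rho_{n,n'}^{m,m'} \triangleq \frac{E\left[x_n^m x_{n'}^{m'}\right]}{\sqrt{(\operatorname{Var} x_n^m)(\operatorname{Var} x_{n'}^{m'})}} \qquad (3.41) $$

as the normalized wavelet correlation, a second consequence is that the wavelet
coefficients are wide-sense stationary at each scale, i.e., for a fixed scale m, $\rho_{n,n'}^{m,m}$ is a
function only of n - n'. Specifically, we may readily establish that

$$ \rho_{n,n'}^{m,m} = \frac{1}{2\pi\sigma^2} \int_{-\infty}^{\infty} \frac{\sigma_x^2}{|\omega|^\gamma} \, |\Psi(\omega)|^2 \, e^{-j(n-n')\omega} \, d\omega. \qquad (3.42) $$

Again, this result may be obtained by specializing (3.40) to the case m' = m and
effecting a change of variables.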

We can also show that the normalized wavelet coefficients possess a kind of stationarity
across scales as well. Recalling from Section 2.2.1 the critically-sampled
filter bank interpretation of the wavelet decomposition, whereby the output of the
mth filter was sampled at the instants $t = 2^{-m}n$ for n = ..., -1, 0, 1, 2, ..., we note that
a pair of wavelet coefficients $x_n^m$ and $x_{n'}^{m'}$ at distinct scales m and m' correspond to
synchronous time-instants precisely when

$$ 2^{-m} n = 2^{-m'} n'. \qquad (3.43) $$

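Our stationarity result in this case is that the normalized correlation among
time-synchronous wavelet coefficients corresponding to scales m and m' is a function only
of m - m'. More precisely, we can show that whenever (3.43) holds,

$$ \rho_{n,n'}^{m,m'} = \frac{2^{(\gamma-1)(m-m')/2}}{2\pi\sigma^2} \int_{-\infty}^{\infty} \frac{\sigma_x^2}{|\omega|^\gamma} \, \Psi(2^{-(m-m')}\omega) \, \Psi^*(\omega) \, d\omega. \qquad (3.44) $$

Again, this result follows from specializing (3.40) and effecting a change of variables.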

The above results verify that the wavelet coefficients of 1/f processes obey the
variance progression anticipated from the synthesis result. Moreover, the stationarity
results provide insight into the correlation structure among wavelet coefficients.
However, what we seek ideally are good bounds on the magnitude of the correlation
among wavelet coefficients both in the case that they reside at the same scale, and in
the case they reside at distinct scales. Certainly, as we will see, there is strong
empirical evidence that the correlation among coefficients is rather small and, in most cases,
negligible. The following theorem provides some theoretical evidence by establishing
an asymptotic result. A proof is provided in Appendix B.5.


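Theorem 3.6 Consider an orthonormal wavelet basis such that ψ(t) has R vanishing
moments, i.e.,

$$ \int_{-\infty}^{\infty} t^r \, \psi(t) \, dt = 0, \qquad r = 0, 1, \ldots, R-1 \qquad (3.45) $$

for some integer R ≥ 1. Then provided 0 < γ < 2R, the wavelet coefficients obtained
by projecting a 1/f process onto this basis have a correlation whose magnitude decays
according to¹

$$ \left|\rho_{n,n'}^{m,m'}\right| \sim O\left( \left|2^{-m}n - 2^{-m'}n'\right|^{-\lceil 2R - \gamma \rceil} \right) \qquad (3.46) $$

as

$$ \left|2^{-m}n - 2^{-m'}n'\right| \to \infty. $$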

While this theorem makes an interesting statement about the relative correlation
among some wavelet coefficients well-separated in (m, n)-space, we must avoid
inferring some stronger statements. First, it says nothing about the correlation among
time-synchronous wavelet coefficients (i.e., those satisfying (3.43)), regardless of how
well-separated they are. Furthermore, while plausible, the theorem itself does not
assert that choosing an analysis wavelet with a larger number of vanishing moments
can reduce the correlation among wavelet coefficients in the analysis of 1/f
processes. Likewise, the theorem does not actually validate the reasonable hypothesis
that choosing a wavelet with an insufficient number of vanishing moments will lead
to strong correlation among the wavelet coefficients of 1/f processes. In fact, the
theorem identifies neither a range of m, m', n, n' over which (3.46) holds, nor a
leading multiplicative constant in (3.46). Consequently, this precludes us from inferring
anything about the absolute correlation between any particular pair of coefficients.

¹The ceiling function ⌈x⌉ denotes the smallest integer greater than or equal to x.

For the case of the ideal bandpass wavelet basis, however, we may obtain some
more useful bounds on the correlation among wavelet coefficients. In this case, the
basis functions corresponding to distinct scales have non-overlapping frequency support.
Hence, carefully exploiting the stationarity properties of 1/f processes developed in
Theorem 3.2, we conclude that the wavelet coefficients corresponding to distinct scales
are uncorrelated. However, at a given scale the correlation at integral lag l > 0 is
nonzero and may be expressed as

$$ \rho_{n,n-l}^{m,m} = \frac{\sigma_x^2}{\pi\sigma^2} \int_{\pi}^{2\pi} \omega^{-\gamma} \cos(\omega l) \, d\omega \qquad (3.47) $$

where

$$ \sigma^2 = \frac{\sigma_x^2}{\pi} \int_{\pi}^{2\pi} \omega^{-\gamma} \, d\omega. \qquad (3.48) $$

While (3.47) cannot be evaluated in closed form, integrating by parts twice and using
the triangle inequality gives the useful closed-form bound

$$ \left|\rho_{n,n-l}^{m,m}\right| \le \frac{\sigma_x^2}{\pi\sigma^2} \, \frac{2\gamma}{l^2 \, \pi^{1+\gamma}} \qquad (3.49) $$

valid for γ ≥ 0 and integer-valued l ≥ 1.

In Fig. 3-4, we plot the exact magnitude of the normalized correlation (3.47)
obtained by numerical integration as a function of lag l together with the bound
(3.49). Note that correlation among wavelet coefficients is extremely small; adjacent
coefficients have a correlation coefficient of less than 15 percent, and more widely
separated coefficients have a correlation coefficient less than three percent. Hence,
it is not an unreasonable approximation to neglect the inter-coefficient correlation in
any analysis using this wavelet basis.
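
The numerical integration is straightforward to reproduce. The sketch below assumes the
ideal bandpass wavelet, with |Ψ(ω)| = 1 on π < |ω| ≤ 2π, so that the normalized
along-scale correlation at lag l reduces to the ratio of the integrals of ω^(-γ) cos(ωl)
and ω^(-γ) over [π, 2π]. That reduction is derived here from the wide-sense stationarity
result and is an assumption of the example, as are the function names and grid size.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on a 1-D grid."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def bandpass_correlation(lag, gamma, num_points=200001):
    """Normalized along-scale correlation for the ideal bandpass wavelet (assumed form)."""
    w = np.linspace(np.pi, 2.0 * np.pi, num_points)
    weight = w ** (-gamma)
    return trapezoid(weight * np.cos(w * lag), w) / trapezoid(weight, w)

for lag in range(1, 6):
    print(lag, abs(bandpass_correlation(lag, gamma=1.0)))
```

For γ = 1 the magnitudes printed for lag 1 and for larger lags can be compared directly
with the 15 percent and three percent figures quoted above.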

On the same plot we superimpose the average along-scale sample-correlation
between wavelet coefficients obtained from a 1/f-type process generated using Keshner's
synthesis. In this simulation, a 65536-sample segment of a 1/f process was
generated for γ = 1 and analyzed using the 5th-order Daubechies wavelet basis. Here, the
sample-correlation function of the coefficients at each scale was computed, and
averaged appropriately with the sample-correlation functions at the other scales. That
the experimental result so closely matches the exact result for the bandlimited
basis suggests that our analysis result for the bandlimited basis may, in fact, be more
broadly applicable.
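
One way such an averaged sample-correlation might be computed is sketched below. This is
not the procedure used for the figure. It substitutes the Haar basis for the 5th-order
Daubechies basis to stay dependency-free, it requires a power-of-two signal length, and
it assumes that "averaged appropriately" means weighting each scale by its number of
coefficient pairs. The function names and thresholds are hypothetical, and a synthetic
random walk stands in for the Keshner-synthesized data.

```python
import numpy as np

def haar_details(x):
    """Orthonormal Haar detail coefficients, finest scale first.

    The input length is assumed to be a power of two.
    """
    approx = np.asarray(x, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return details

def average_along_scale_correlation(x, max_lag=4, min_coeffs=64):
    """Average normalized sample correlation at lags 1..max_lag.

    Scales with fewer than min_coeffs coefficients are skipped; the others are
    weighted by their number of coefficient pairs (an assumed weighting).
    """
    num = np.zeros(max_lag)
    den = np.zeros(max_lag)
    for d in haar_details(x):
        if d.size < min_coeffs:
            continue
        var = np.mean(d * d)               # coefficients are modeled as zero-mean
        for lag in range(1, max_lag + 1):
            pairs = d.size - lag
            rho = np.mean(d[lag:] * d[:-lag]) / var
            num[lag - 1] += pairs * rho
            den[lag - 1] += pairs
    return num / den

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(2 ** 14))   # synthetic stand-in data
print(average_along_scale_correlation(walk))
```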


Figure 3-4: Along-scale correlation between wavelet coefficients for an exactly-1/f
process for which γ = 1, plotted against lag l. The squares □ indicate a numerical
estimate of the exact magnitude of the normalized correlation between wavelet
coefficients as a function of the lag l between them. The ideal bandpass wavelet was
assumed in the analysis. The triangles △ indicate the corresponding values of the
closed-form bound obtained in the text. The circles ○ show the average
sample-correlation as computed from projections of a 1/f process generated using
Keshner's synthesis onto a 5th-order Daubechies wavelet basis.

Before concluding this analysis section, it is appropriate to note that several of the
results  we  have  described  herein  have  been  derived  independently  for  the  particular 
case  of  fractional  Brownian  motion  using  a  time-domain  approach.  For  instance,  the 
stationarity  of  the  wavelet  coefficients  at  a  fixed  scale  was  first  established  by  Flandrin 
in  [29],  while  the  inter-scale  stationarity  property  was  described  by  Flandrin  (after 
Vergassola  and  Frisch  [54])  in  the  recent  work  [55].  Likewise,  the  expression  for  the 
asymptotic  rate-of-decay  of  correlation  among  wavelet  coefficients  is  essentially  the 
same  as  that  first  derived  by  Tewfik  and  Kim  [56].  We  also  mention  that  Flandrin  [55] 
is  able  to  provide  stronger  statements  about  the  correlation  among  wavelet  coefficients 
of  fractional  Brownian  motion  for  the  specific  case  of  the  Haar  wavelet  basis.  Finally, 
we  remark  that  it  ought  to  be  possible  to  interpret  the  decorrelation  results  presented 
both  here  and  in  the  works  of  the  above  authors  in  the  context  of  more  general 
results  that  have  emerged  concerning  the  effectiveness  of  wavelet  decompositions  in 
decorrelating  a  broad  class  of  smooth  covariance  kernels  [24]. 

Finally, it is useful to remark that through the octave-band filter bank
interpretation of wavelet bases, we may view wavelet-based analysis as spectral analysis on
a logarithmic frequency scale. The results of this section, together with our
observations of the spectral characteristics of 1/f processes earlier in the chapter, suggest
that this kind of spectral analysis is, in some sense, ideally matched to 1/f-type
behavior. In the final section of this chapter, we undertake such a log-based spectral
analysis of some real data sets using wavelets and show additional evidence that such
analysis is potentially both useful and important in these cases.

Experiments 

In this section, we undertake a very preliminary investigation of the properties of
wavelet coefficients derived from some physical data sets. In the process, we identify
two instances of time series that would appear to be well-modeled as 1/f processes.
The first is an example involving economic data. Fig. 3-5 shows the time series
corresponding to raw weekly Dow Jones Industrial Average data accumulated over the
past 80 years. As shown in Fig. 3-6(a), the sample-variance of wavelet coefficients from
scale to scale obeys a geometric progression consistent with a 1/f process for which
γ ≈ 2. In Fig. 3-6(b), we see that the average along-scale sample-correlation among
wavelet coefficients is rather weak. Since adjacent coefficients have a correlation of
less than 8 percent, and more widely separated coefficients have a correlation of less
than 3 percent, it would appear reasonable to neglect the inter-coefficient correlation
in the analysis of such data. While this behavior is also consistent with a 1/f-type
model for the data, we note that to justify such a model more fully, it would be
necessary to study the correlation among coefficients between scales as well.

Figure 3-5: Weekly Dow Jones Industrial Average data, to present. (Horizontal axis: week.)
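
The variance-progression test described above can be sketched in a few lines of Python.
The example is illustrative only: it uses the Haar basis instead of the 5th-order
Daubechies basis, it requires a power-of-two signal length, and it uses a synthetic
random walk as stand-in data, since a random walk is a sampled Wiener process and should
therefore give an estimate near γ = 2. The function names, the scale-selection
thresholds, and the unweighted least-squares fit are assumptions made here.

```python
import numpy as np

def haar_details(x):
    """Orthonormal Haar detail coefficients, finest scale first.

    The input length is assumed to be a power of two.
    """
    approx = np.asarray(x, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return details

def estimate_gamma(x, min_coeffs=64, skip_finest=2):
    """Least-squares slope of log2(sample variance) against level j.

    Level j = 1 is the finest scale and each increase in j halves the
    resolution, so a variance progression proportional to 2**(-gamma*m) in the
    scale index m appears as a slope of +gamma in j.
    """
    levels, log_var = [], []
    for j, d in enumerate(haar_details(x), start=1):
        if j <= skip_finest or d.size < min_coeffs:
            continue
        levels.append(j)
        log_var.append(np.log2(np.mean(d * d)))
    slope, _ = np.polyfit(levels, log_var, 1)
    return slope

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(2 ** 16))   # synthetic stand-in data
print(estimate_gamma(walk))
```

The coarsest scales are excluded because they contain too few coefficients for a
reliable variance estimate, and the two finest scales are excluded because the
discrete-time Haar variances of a random walk only approach the ideal factor-of-four
progression after the first few levels.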

The second example involves physiological data. Fig. 3-7 shows a record of heartbeat
interarrival times for a healthy human patient corresponding to approximately
11 hours of continuously-acquired data. The quantization levels of the interarrival
times are spaced four milliseconds apart. In this example, as shown in Fig. 3-8(a),
the sample-variances of wavelet coefficients from scale to scale obey a geometric
progression consistent with a 1/f process of γ ≈ 1. When viewing these progressions it
is important to note that the number of samples available to make a variance estimate
doubles at each successively finer scale. Hence, the standard deviation of the
sample-variance measurement decreases by a factor of √2 for each successive increase in m.
As a result, 1/f behavior manifests itself in the form of a log-variance characteristic
that must be asymptotically linear in the limit of large m. In Fig. 3-8(b), we show
the weak average along-scale sample-correlation between wavelet coefficients. In this
case, coefficients separated by lags of two or more are correlated less than 2 percent,
again suggesting that it is reasonable to neglect such inter-coefficient correlation in
any wavelet-based analysis. Again, we caution that no attempt was made to study
the correlation structure among coefficients between scales.

Figure 3-6: Wavelet-based analysis of weekly Dow Jones Industrial Average data.
The time-series is analyzed using a 5th-order Daubechies wavelet basis.
(a) Scale-to-scale wavelet coefficient sample-variance progression.
(b) Average magnitude of the normalized along-scale sample-correlation between wavelet
coefficients, plotted against lag l.

Figure 3-7: Heartbeat interarrival times for a healthy patient.

Together, our theoretical and empirical results suggest that the orthonormal wavelet
transform is a potentially useful and convenient tool in the synthesis and analysis
of 1/f-type processes. In the next chapter we will explore how the wavelet transform
plays an equally valuable role in the processing of such signals.

Figure 3-8: Wavelet-based analysis of the heartbeat interarrival times for a healthy
patient. The time-series is analyzed using a 5th-order Daubechies wavelet basis.
(a) Scale-to-scale wavelet coefficient sample-variance progression.
(b) Average magnitude of the normalized along-scale sample-correlation between wavelet
coefficients, plotted against lag l.

Chapter 4

Detection and Estimation with 1/f Processes


Given the ubiquity of physical signals exhibiting 1/f-type behavior, there are many
application contexts in which one is interested in developing efficient algorithms for
processing such signals. For instance, one is frequently interested in problems of

-  detection and classification

-  characterization and parametrization

-  prediction and interpolation, and

-  separation of 1/f signals both from one another as well as from other types of
known or partially-known signals.

In some cases, the 1/f signal itself is of primary interest. An example would be the
problem of modeling stock market data such as the Dow Jones Industrial Average as a
1/f process. In other cases, the 1/f signal represents a noise process obscuring some
other signal of interest. This is more likely to be the case in optical and electronic
systems, for example, where 1/f noise is a predominant form of background noise.

Even when the 1/f signal is of primary interest, one rarely has perfect access to
such signals. Typically, our observations are incomplete. Indeed, they will generally
be time-limited and resolution-limited. More generally, the observations may contain
gaps, or there may be multiple observations. Additionally, any observations of 1/f
signals will invariably be corrupted by some degree of broadband measurement noise¹.
It is important to both recognize and accommodate such measurement noise in any
algorithms for processing such data. Indeed, because 1/f signals have a spectral density
that vanishes for sufficiently high frequencies, there necessarily exists some frequency
threshold beyond which the broadband noise is predominant. As a consequence, such
noise can strongly affect the performance of signal processing algorithms.
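
To make the threshold concrete, suppose, as an illustrative assumption, that the 1/f
signal has spectrum $\sigma_x^2/|\omega|^\gamma$ with γ > 0 and that the measurement noise is white
with flat spectral density $\sigma_w^2$ (a symbol introduced here for the example). Equating
the two spectra gives the crossover frequency $\omega_c$:

```latex
\frac{\sigma_x^2}{|\omega_c|^{\gamma}} = \sigma_w^2
\quad\Longrightarrow\quad
\omega_c = \left(\frac{\sigma_x^2}{\sigma_w^2}\right)^{1/\gamma},
\qquad
\frac{\sigma_x^2}{|\omega|^{\gamma}} < \sigma_w^2
\ \text{ for } |\omega| > \omega_c .
```

Under this assumption, raising the noise level lowers the crossover frequency, and the
effect is stronger for small γ because of the exponent 1/γ.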

In this chapter, we develop some optimal algorithms for addressing a number of
basic signal processing problems involving detection and estimation with 1/f-type
signals. Our basic approach is to exploit the properties of wavelet-based representations
of 1/f-type processes. In particular, based upon the synthesis result of Theorem 3.4
and supported by the subsequent analysis results, our model for 1/f signals will be
signals which, when expanded into an orthonormal wavelet basis, yield coefficients that
are effectively uncorrelated and obey the appropriate variance progression. That is,
we will exploit the role of the wavelet expansion as a Karhunen-Loève-like expansion
for 1/f-type processes. Because extremely efficient algorithms exist for computing
orthonormal wavelet transformations as discussed in Section 2.2.3, this approach is not
only analytically convenient for solving these signal processing problems, but leads to
computationally highly efficient structures for implementing the resulting algorithms.

Throughout we will be careful to incorporate additive stationary white measurement
noise for ensuring the robustness of our models as described above. Furthermore,
most of the algorithms we develop will be designed specifically for the case of
Gaussian 1/f processes and Gaussian measurement noises. While this requirement is
principally motivated by tractability requirements, there are, in fact, many contexts
in which this is a physically reasonable assumption. Furthermore, several of the
algorithms we develop retain many of their important properties in the more general
non-Gaussian case.

¹Actually, the coexistence of 1/f and white noises in electronic and optical systems is
well-documented. In electronic systems, for instance, the predominant noise is 1/f noise at frequencies
below about 1 kHz, while at higher frequencies, it is white noise in the form of thermal (i.e., Johnson)
and shot noise [57].

Accompanying each of the algorithms we develop is an evaluation of its performance
in various settings. It is important to realize that such evaluations are necessarily
of a highly preliminary nature, and are not meant to be interpreted as in any
way comprehensive. Their primary function is to establish the basic viability of the
algorithms, and to reveal some of their basic properties. Many of these performance
studies involve Monte Carlo simulations with synthetic data. For these simulations,
we generate 1/f processes using the Corsini-Saletti implementation of Keshner's
synthesis described in [52]. Because this synthesis is fundamentally different from a
wavelet-based synthesis, such simulations play an important role in verifying the
robustness of the wavelet-based algorithms with respect to our particular model for
1/f-type behavior. However, by their design, these simulations will generally not
enable us to isolate the effects of modeling error alone.

Additionally, there are a large number of wavelet bases from which to select for
our algorithms. However, given the apparent insensitivity of the wavelet-based model
for 1/f-type behavior to the choice of basis, for our simulations we choose, somewhat
arbitrarily, to use the basis corresponding to Daubechies' 5th-order finite-extent
maximally-regular wavelet for which the corresponding conjugate quadrature filters
have 10 non-zero coefficients. We remark that in addition to being realizable, this
basis satisfies the conditions of the theorems of Section 3.2.2 concerning the synthesis
and analysis of 1/f-type behavior using wavelets. Specifically, the basis has more
than enough vanishing moments to accommodate spectral parameters in our principal
range of interest, 0 < γ < 2.

Until recently, such problems of detection and estimation involving 1/f-type
processes received relatively little attention in the literature. However, there has been
strongly increasing interest in the topic and a number of interesting and useful related
results have been developed. An important instance is the elegant work of Barton
and Poor [42], who consider problems of detection in the presence of fractional
Gaussian noise using reproducing kernel Hilbert space theory. Using this framework, the
authors are able to develop both infinite- and finite-interval whitening filters for this
class of 1/f noises, which, in turn, allow them to obtain some important results on
the detection of deterministic and Gaussian signals in the presence of such noise.

There is also a substantial and growing body of recent literature on the general
topic of multiresolution stochastic processes, systems, and signal processing. An
excellent overview of work in this area is contained in [58] and the references therein. In
this work, the authors develop a clever tree- or lattice-based framework for modeling
multiscale processes and problems, and introduce some novel notions of “stationarity
in scale” for such processes. Treating multiscale processes as “dynamical systems in
scale” leads to several highly efficient algorithms for addressing a variety of problems
involving parameter estimation, signal smoothing and interpolation, and data fusion.
The 1/f-type models we exploit in this chapter constitute a special class of the
multiresolution stochastic processes developed by these authors. In particular, they are
examples of processes characterized by a “Markov scale-to-scale” property. As a
consequence, many of the multiresolution signal processing algorithms they develop are
directly applicable to 1/f processes as shown, e.g., in [23].

There  are  also  interesting  parallels  between  the  results  of  this  chapter  and  recent 
work  by  Tewfik  and  Kim.  For  example,  in  [59],  these  authors  develop  the  notion 
suggested  by  Beylkin,  Coifman  and  Rokhlin  in  [24],  that  wavelet-based  and  other, 
more  general,  filter  bank  decompositions  are  useful  in  transforming  the  correlation 
structure  in  a  broad  class  of  stationary  and  nonstationary  processes  (including,  e.g.,  a 
class of 1/f processes) into a form more convenient for signal processing. Specifically,
they  show  how  the  structure  of  the  correlation  in  such  decompositions  leads  directly 
to  computationally  efficient,  general  purpose  algorithms  for  signal  processing.  As  an 
example,  [60]  summarizes  how  their  results  can  be  applied  to  the  problem  of  signal 
detection  in  the  presence  of  fractional  Brownian  motion. 

4.1 1/f Synthesis and Whitening Filters

Many of the results on detection and estimation we will derive in this chapter are
conveniently interpreted in a canonical form through the concept of a reversible (or
invertible) whitening filter for 1/f processes. In this section, we derive such whitening
filters and their inverses for the particular wavelet-based model for 1/f-type behavior
which we intend to exploit in this chapter.

To begin, if $x(t)$ is a 1/f signal corresponding to some spectral exponent $\gamma$, we
model the corresponding wavelet coefficients $x_n^m$ as zero-mean random variables
having negligible correlation and obeying a variance progression of the form

$$\operatorname{Var} x_n^m = \sigma^2 \beta^{-m}$$

where, for notational convenience, we define

$$\beta = 2^{\gamma}. \tag{4.1}$$

In turn, we may express the $x_n^m$ as

$$x_n^m = \sigma \beta^{-m/2} v_n^m$$

where the $v_n^m$ are then zero-mean, unit-variance, uncorrelated random variables.
Hence, the process $v(t)$ defined according to

$$v(t) = \sum_m \sum_n v_n^m \psi_n^m(t)$$

is a wide-sense stationary white noise process since the $\psi_n^m(t)$ constitute a complete
orthonormal set. This suggests that we may model $x(t)$ as the output of a linear
system driven by stationary white noise $v(t)$. In particular, the system performs
an orthonormal wavelet transform on the input $v(t)$, scales each of the resulting
coefficients $v_n^m$ by a factor

$$k_n^m = \sigma \beta^{-m/2},$$

then inverse wavelet transforms the resulting $x_n^m$ to generate the output process $x(t)$.
This 1/f synthesis filter, defined via

$$x(t) = \mathcal{W}_d^{-1}\left\{ \sigma \beta^{-m/2}\, \mathcal{W}_d\{v(t)\} \right\} \tag{4.2}$$

is a linear filter whose kernel¹ is

$$\kappa_s(t,\tau) = \sum_m \sum_n \psi_n^m(t)\, \sigma \beta^{-m/2}\, \psi_n^m(\tau). \tag{4.3a}$$

We emphasize that viewing $x(t)$ as the output of a linear system with kernel (4.3a)
driven by stationary white noise $v(t)$ is especially useful in the Gaussian scenario,
in which case $v(t)$ is a stationary white Gaussian process. Nevertheless, for
non-Gaussian processes this characterization remains useful at least insofar as modeling
the second-order properties of $x(t)$ is concerned.

From the wavelet-based characterization of the synthesis filter (4.2) we readily
deduce that this filter is invertible, and that its inverse has kernel

$$\kappa_w(t,\tau) = \sum_m \sum_n \psi_n^m(t)\, \frac{1}{\sigma \beta^{-m/2}}\, \psi_n^m(\tau). \tag{4.3b}$$

This is, therefore, the corresponding whitening filter for our model of 1/f-type
behavior. Indeed, when this filter is driven by a process obtained as the output of our 1/f
synthesis filter, the output is, evidently, a wide-sense stationary white process. When
driven by an exactly-1/f process, the properties of the output are readily described
in terms of the analysis results of Section 3.2.2.
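
The synthesis and whitening operations of (4.2) and (4.3b) can be illustrated with a short discrete-time sketch. This is only a sketch under stated assumptions: it uses the Haar basis for brevity rather than the Daubechies basis used in our simulations, it takes $N = 2^M$ samples so that scale $m$ carries $2^{m-1}$ coefficients, and it leaves the single coarsest approximation coefficient unscaled. The function names are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def haar_analysis(x):
    """Orthonormal Haar DWT of a length-2^M sequence.

    Returns (approx, details); details[m - 1] holds the 2^(m-1) coefficients
    at scale m, so a larger m means a finer resolution.
    """
    approx, details = list(x), []
    while len(approx) > 1:
        pairs = [(approx[2 * i], approx[2 * i + 1]) for i in range(len(approx) // 2)]
        details.insert(0, [(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def haar_synthesis(approx, details):
    """Inverse of haar_analysis, rebuilding from the coarsest scale upward."""
    x = list(approx)
    for d in details:
        x = [v for s, c in zip(x, d) for v in ((s + c) / SQRT2, (s - c) / SQRT2)]
    return x


def apply_gains(details, sigma, gamma, invert):
    """Multiply (or divide, if invert) scale-m coefficients by sigma * beta^(-m/2)."""
    beta = 2.0 ** gamma
    out = []
    for m, d in enumerate(details, start=1):
        gain = sigma * beta ** (-m / 2.0)
        out.append([c / gain if invert else c * gain for c in d])
    return out


def synthesize(v, sigma, gamma):
    """Discrete analogue of (4.2): white input v -> 1/f-type output x."""
    approx, details = haar_analysis(v)
    return haar_synthesis(approx, apply_gains(details, sigma, gamma, invert=False))


def whiten(x, sigma, gamma):
    """Discrete analogue of (4.3b): undo the per-scale gains."""
    approx, details = haar_analysis(x)
    return haar_synthesis(approx, apply_gains(details, sigma, gamma, invert=True))


if __name__ == "__main__":
    random.seed(0)
    v = [random.gauss(0.0, 1.0) for _ in range(1024)]
    x = synthesize(v, sigma=1.0, gamma=1.0)
    v_back = whiten(x, sigma=1.0, gamma=1.0)
    print("max round-trip error:", max(abs(a - b) for a, b in zip(v, v_back)))
```

Because both filters act diagonally on the wavelet coefficients, cascading them recovers the white input up to rounding error, which is the invertibility property noted above.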

As discussed at the outset of the chapter, any 1/f-type process we consider shall
invariably be accompanied by an additive stationary white observation noise
component. Consequently, we shall frequently find the notion of synthesis and whitening
filters for the combined 1/f-plus-white processes convenient in interpreting our
algorithms. These filters are, of course, closely related to the filters derived above. In
fact, it is straightforward to establish that synthesis and whitening filters for
1/f-plus-white processes are characterized by the respective kernels

$$\kappa_s(t,\tau) = \sum_m \sum_n \psi_n^m(t)\, \sigma_m\, \psi_n^m(\tau) \tag{4.4a}$$

$$\kappa_w(t,\tau) = \sum_m \sum_n \psi_n^m(t)\, \frac{1}{\sigma_m}\, \psi_n^m(\tau) \tag{4.4b}$$

where $\sigma_m > 0$ is defined by

$$\sigma_m^2 = \sigma^2 \beta^{-m} + \sigma_w^2 \tag{4.5}$$

and $\sigma_w^2$ is the spectral density of the white noise component. The canonical
wavelet-based realization of these filters is depicted in Fig. 4-1.

¹In our notation, the kernel $\kappa(t, \tau)$ of a linear system defines the response of the system at time
$t$ to a unit impulse at time $\tau$. Consequently the response of the system to a suitable input $x(t)$ is
expressed as

$$y(t) = \int_{-\infty}^{\infty} x(\tau)\, \kappa(t, \tau)\, d\tau.$$

4.2  Parameter  Estimation  for  1/f  Signals 

In this section, we consider the problem of estimating the parameters of a Gaussian
1/f signal from observations corrupted by stationary white Gaussian noise. Since,
typically, we lack a priori knowledge of the spectral density of the noise, we consider,
more generally, the problem of jointly estimating signal and noise parameters for this
scenario.

Such parameter estimates, in addition to providing a solution to the associated
1/f spectrum estimation problem, are frequently of interest in their own right.
Indeed, from the parameter estimates, we can directly compute the fractal dimension
of the underlying signal, when defined. Robust estimation of the fractal dimension
of 1/f processes is important in a number of applications such as in signal detection
and classification. For example, in image processing, where 2-D extensions of 1/f
processes are used to model natural terrain and other patterns and textures [32] [43],
fractal dimension can be of use in distinguishing among various man-made and
natural objects. While several approaches to the fractal dimension estimation problem
have been presented previously in the literature (see [43], [61], [62], and the references
therein), to date none has been able to adequately handle the presence of broadband
noise in the observation data. In fact, the quality of the estimates generally
deteriorates dramatically in the presence of such noise even at high SNR [43]. Since noise is
inherently present in any real data, this lack of robustness has limited the usefulness
of these algorithms. In this section we obtain, indirectly, fractal dimension estimators
for Gaussian 1/f processes that explicitly take into account the presence of additive
white Gaussian observation noise. The resulting iterative estimation algorithms are
computationally efficient, robust, and statistically consistent.

Figure 4-1: Canonical form realizations of synthesis and whitening filters for processes
that are the superposition of 1/f-type and white components, i.e., 1/f-plus-white
processes. (a) Synthesis filter, kernel $\kappa_s(t, \tau)$. (b) Whitening filter, kernel $\kappa_w(t, \tau)$.

Our basic approach is to apply the method of Maximum Likelihood (ML)
estimation, exploiting the wavelet-based characterization of our 1/f model. And, while
we specifically consider the case of Gaussian 1/f processes corrupted by additive
stationary white Gaussian measurement noise in our formulation of the problem, we
emphasize that the resulting estimators are, in fact, applicable to a broader class of
non-Gaussian 1/f processes and measurement noise models, and retain many
desirable properties.

We formulate our problem as follows. Suppose we have observations $r(t)$ of a
zero-mean Gaussian 1/f process $x(t)$ embedded in zero-mean additive stationary
white Gaussian noise $w(t)$ that is statistically independent of $x(t)$, so

$$r(t) = x(t) + w(t), \qquad -\infty < t < \infty. \tag{4.6}$$

From this continuous-time data, we assume we have extracted a number of wavelet
coefficients, $r_n^m$. In theory, we may assume these coefficients are obtained by projecting
the wavelet basis functions onto the observed data:

$$r_n^m = \int_{-\infty}^{\infty} \psi_n^m(t)\, r(t)\, dt.$$

However, in practice, these coefficients can be obtained by applying the
computationally efficient DWT to the samples of a segment of data which is both time-limited
and resolution-limited, as described in Section 2.2.3. Let us assume that the finite
set of available distinct scales, $\mathcal{M}$, is, in increasing order,

$$\mathcal{M} = \{m_1, m_2, \ldots, m_M\}, \tag{4.7a}$$

and that at each scale $m$ the set of available coefficients $\mathcal{N}(m)$ is²

$$\mathcal{N}(m) = \{n_1(m), n_2(m), \ldots, n_{N(m)}(m)\}. \tag{4.7b}$$

Hence, the data available to the estimation algorithm is

$$\mathbf{r} = \{r_n^m \in \mathcal{R}\} = \{r_n^m,\ m \in \mathcal{M},\ n \in \mathcal{N}(m)\}. \tag{4.8}$$

We remark before proceeding that, based on the discussion in Section 2.2.4, for
an implementation via the DWT with $N = N_0 2^M$ samples of observed data, we have,
typically,

$$\mathcal{M} = \{1, 2, \ldots, M\} \tag{4.9a}$$

$$\mathcal{N}(m) = \{1, 2, \ldots, N_0 2^{m-1}\}, \tag{4.9b}$$

where $N_0$ is a constant that depends on the length of the filter $h[n]$. Consequently,
while many of the results we derive will be applicable to the more general scenario,
we will frequently specialize our results to this case.

Exploiting the Karhunen-Loeve-like properties of the wavelet decomposition for
1/f-type processes, and using the fact that the $w_n^m$ are independent of the $x_n^m$ and
are decorrelated for any wavelet basis, the resulting observation coefficients

$$r_n^m = x_n^m + w_n^m$$

can be modeled as mutually-independent zero-mean, Gaussian random variables with
variance

$$\operatorname{Var} r_n^m = \sigma_m^2 = \sigma^2 \beta^{-m} + \sigma_w^2$$

where $\beta$ is defined in terms of the spectral exponent $\gamma$ of the 1/f process according
to (4.1). Hence, it is the parameter set

$$\Theta = (\beta, \sigma^2, \sigma_w^2)$$

we wish to estimate. As discussed at the outset, it is often the case that only $\beta$ or
some function of $\beta$ such as the spectral exponent $\gamma$, the fractal dimension $D$, or the
self-similarity parameter $H$, is of interest. Nevertheless, $\sigma^2$ and $\sigma_w^2$ will still need to
be estimated simultaneously as they are rarely known a priori. Furthermore, ML
estimates of $\gamma, D, H$ are readily derived from the ML estimate $\hat{\beta}_{ML}$. Indeed, since
each of these parameters is related to $\beta$ through an invertible transformation, we
have

$$\hat{\gamma}_{ML} = \log_2 \hat{\beta}_{ML} \tag{4.10a}$$

$$\hat{D}_{ML} = (5 - \hat{\gamma}_{ML})/2 \tag{4.10b}$$

$$\hat{H}_{ML} = (\hat{\gamma}_{ML} - 1)/2. \tag{4.10c}$$

²Note that, without loss of generality we may assume $N(m) \neq 0$, any $m$, or else the corresponding
scale $m$ could be deleted from $\mathcal{M}$.

summarize the aspects of the data required in the estimation. It is straightforward
to show that the likelihood function in this case is well-behaved and bounded from
above on

$$\beta > 0, \quad \sigma^2 > 0, \quad \sigma_w^2 > 0$$

so that, indeed, maximizing the likelihood function is reasonable.
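
As a minimal sketch of the quantities involved, the following evaluates a Gaussian log-likelihood for the independent zero-mean coefficient model with variances $\sigma^2\beta^{-m} + \sigma_w^2$, together with the conversions (4.10). The form of the log-likelihood written here is the standard one for such a model and is an assumption of the sketch; the per-scale sample variances passed in as `sample_vars`, and all function names, are likewise hypothetical.

```python
import math


def log_likelihood(beta, sigma2, sigma_w2, counts, sample_vars):
    """Gaussian log-likelihood of independent zero-mean coefficients.

    counts[m] is N(m), the number of coefficients at scale m, and
    sample_vars[m] is the average of the squared coefficients at that scale.
    """
    total = 0.0
    for m, n_m in counts.items():
        var_m = sigma2 * beta ** (-m) + sigma_w2  # model variance at scale m
        total += n_m * (sample_vars[m] / var_m + math.log(2.0 * math.pi * var_m))
    return -0.5 * total


def derived_parameters(beta_ml):
    """Map an estimate of beta to gamma, D and H via (4.10a)-(4.10c)."""
    gamma = math.log2(beta_ml)
    return {"gamma": gamma, "D": (5.0 - gamma) / 2.0, "H": (gamma - 1.0) / 2.0}


if __name__ == "__main__":
    counts = {m: 2 ** (m - 1) for m in range(1, 9)}
    # Sample variances that match the model exactly at beta=2, sigma2=2, sigma_w2=0.1.
    sample_vars = {m: 2.0 * 2.0 ** (-m) + 0.1 for m in counts}
    print(log_likelihood(2.0, 2.0, 0.1, counts, sample_vars))
    print(log_likelihood(2.5, 2.0, 0.1, counts, sample_vars))
    print(derived_parameters(2.0))
```

When the sample variances coincide with the model variances, each per-scale term is individually maximized, so the first printed value is the larger of the two.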

While we shall assume that $\beta$, $\sigma^2$, $\sigma_w^2$ are all unknown, it will be appropriate during
the development to also specialize results to the case in which $\sigma_w^2$ is known. Still more
specific results will be described when $\sigma_w^2 = 0$, corresponding to the case of noise-free
observations. We may also assume, where necessary, that all $m \in \mathcal{M}$ are positive
without loss of generality. Indeed, if, for example, $m_1 < 0$, then we could define new
parameters through the invertible transformation

$$\tilde{\sigma}_w^2 = \sigma_w^2$$

$$\tilde{\sigma}^2 = \sigma^2 \beta^{1 - m_1}$$

$$\tilde{\beta} = \beta$$

for which the observations correspond to positive scales

$$\tilde{\mathcal{M}} = \{1,\ m_2 - m_1 + 1,\ \ldots,\ m_M - m_1 + 1\}$$

and which lead to the same ML estimates for $\beta$, $\sigma^2$, $\sigma_w^2$.

4.2.1 Case I: $\beta$, $\sigma^2$, $\sigma_w^2$ Unknown

Differentiating $L(\Theta)$ with respect to $\sigma_w^2$, $\sigma^2$, and $\beta$, respectively, it follows that the
stationary points of $L(\Theta)$ are given as the solutions to the equations

$$\sum_{m \in \mathcal{M}} T_m = 0$$

$$\sum_{m \in \mathcal{M}} \beta^{-m} T_m = 0$$

$$\sum_{m \in \mathcal{M}} m \beta^{-m} T_m = 0$$

where

$$T_m \triangleq \frac{N(m)}{\sigma_m^2} \left[ 1 - \frac{\hat{\sigma}_m^2}{\sigma_m^2} \right].$$

However, these equations are difficult to solve, except in special cases. Consequently,
we utilize an iterative estimate-maximize (EM) algorithm [63].

A detailed development of the EM algorithm for our problem is given in
Appendix C. The essential steps of the algorithm are summarized below, where we
denote the estimates of the parameters $\beta$, $\sigma^2$, $\sigma_w^2$ generated on the $l$th iteration by
$\hat{\beta}^{[l]}$, $\hat{\sigma}^{2[l]}$, $\hat{\sigma}_w^{2[l]}$.

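E step: As shown in Appendix C, this step reduces to estimating the noise and signal
portions of the wavelet coefficient variances at each scale $m \in \mathcal{M}$ using current
estimates of the parameters $\hat{\beta}^{[l]}$, $\hat{\sigma}^{2[l]}$, $\hat{\sigma}_w^{2[l]}$:

$$S_m^w(\hat{\Theta}^{[l]}) = A_m(\hat{\Theta}^{[l]}) + B_m^w(\hat{\Theta}^{[l]})\, \hat{\sigma}_m^2 \tag{4.13a}$$

$$S_m^x(\hat{\Theta}^{[l]}) = A_m(\hat{\Theta}^{[l]}) + B_m^x(\hat{\Theta}^{[l]})\, \hat{\sigma}_m^2 \tag{4.13b}$$

where

$$A_m(\hat{\Theta}^{[l]}) = \frac{\hat{\sigma}_w^{2[l]} \cdot \hat{\sigma}^{2[l]} [\hat{\beta}^{[l]}]^{-m}}{\hat{\sigma}_w^{2[l]} + \hat{\sigma}^{2[l]} [\hat{\beta}^{[l]}]^{-m}} \tag{4.14a}$$

$$B_m^w(\hat{\Theta}^{[l]}) = \left( \frac{\hat{\sigma}_w^{2[l]}}{\hat{\sigma}_w^{2[l]} + \hat{\sigma}^{2[l]} [\hat{\beta}^{[l]}]^{-m}} \right)^2 \tag{4.14b}$$

$$B_m^x(\hat{\Theta}^{[l]}) = \left( \frac{\hat{\sigma}^{2[l]} [\hat{\beta}^{[l]}]^{-m}}{\hat{\sigma}_w^{2[l]} + \hat{\sigma}^{2[l]} [\hat{\beta}^{[l]}]^{-m}} \right)^2 \tag{4.14c}$$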

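M step: This step reduces to using these signal and noise variance estimates to
obtain the new parameter estimates $\hat{\beta}^{[l+1]}$, $\hat{\sigma}^{2[l+1]}$, $\hat{\sigma}_w^{2[l+1]}$:

$$\hat{\beta}^{[l+1]} \leftarrow \sum_{m \in \mathcal{M}} C_m N(m)\, S_m^x(\hat{\Theta}^{[l]})\, \beta^m = 0 \tag{4.15a}$$

$$\hat{\sigma}^{2[l+1]} = \frac{\sum_{m \in \mathcal{M}} N(m)\, S_m^x(\hat{\Theta}^{[l]})\, [\hat{\beta}^{[l+1]}]^m}{\sum_{m \in \mathcal{M}} N(m)} \tag{4.15b}$$

$$\hat{\sigma}_w^{2[l+1]} = \frac{\sum_{m \in \mathcal{M}} N(m)\, S_m^w(\hat{\Theta}^{[l]})}{\sum_{m \in \mathcal{M}} N(m)} \tag{4.15c}$$

where

$$C_m \triangleq \frac{m}{\sum_{m \in \mathcal{M}} m N(m)} - \frac{1}{\sum_{m \in \mathcal{M}} N(m)}. \tag{4.16}$$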

4.2.2 Case II: $\beta$, $\sigma^2$ Unknown; $\sigma_w^2$ Known

If $\sigma_w^2$ is known, the above algorithm simplifies somewhat. In particular, we may omit
the estimation (4.15c) and replace occurrences of $\hat{\sigma}_w^{2[l]}$ in the algorithm with the true
value $\sigma_w^2$. This eliminates the need to compute $S_m^w(\hat{\Theta}^{[l]})$ and, hence, $B_m^w(\hat{\Theta}^{[l]})$. The
resulting algorithm is as follows.

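E step: Estimate the signal portion of the wavelet coefficient variances at each scale
$m \in \mathcal{M}$ using current estimates of the parameters $\hat{\beta}^{[l]}$, $\hat{\sigma}^{2[l]}$:

$$S_m^x(\hat{\Theta}^{[l]}) = A_m(\hat{\Theta}^{[l]}) + B_m^x(\hat{\Theta}^{[l]})\, \hat{\sigma}_m^2 \tag{4.17}$$

where

$$A_m(\hat{\Theta}^{[l]}) = \frac{\sigma_w^2 \cdot \hat{\sigma}^{2[l]} [\hat{\beta}^{[l]}]^{-m}}{\sigma_w^2 + \hat{\sigma}^{2[l]} [\hat{\beta}^{[l]}]^{-m}} \tag{4.18a}$$

$$B_m^x(\hat{\Theta}^{[l]}) = \left( \frac{\hat{\sigma}^{2[l]} [\hat{\beta}^{[l]}]^{-m}}{\sigma_w^2 + \hat{\sigma}^{2[l]} [\hat{\beta}^{[l]}]^{-m}} \right)^2 \tag{4.18b}$$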

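M step: Use these signal variance estimates to obtain the new parameter estimates
$\hat{\beta}^{[l+1]}$, $\hat{\sigma}^{2[l+1]}$:

$$\hat{\beta}^{[l+1]} \leftarrow \sum_{m \in \mathcal{M}} C_m N(m)\, S_m^x(\hat{\Theta}^{[l]})\, \beta^m = 0 \tag{4.19a}$$

$$\hat{\sigma}^{2[l+1]} = \frac{\sum_{m \in \mathcal{M}} N(m)\, S_m^x(\hat{\Theta}^{[l]})\, [\hat{\beta}^{[l+1]}]^m}{\sum_{m \in \mathcal{M}} N(m)} \tag{4.19b}$$

where $C_m$ is as in (4.16).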

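4.2.3 Case III: $\beta$, $\sigma^2$ Unknown; $\sigma_w^2 = 0$

If $\sigma_w^2$ is known (or assumed) to be zero, the EM algorithm becomes unnecessary
as the likelihood may be maximized directly. Specifically, with $\sigma_w^2 = 0$, the signal
variance estimates are available directly as $\hat{\sigma}_m^2$. Hence the estimation simplifies to the
following:

$$\hat{\beta}_{ML} \leftarrow \sum_{m \in \mathcal{M}} C_m N(m)\, \hat{\sigma}_m^2\, \beta^m = 0 \tag{4.20a}$$

$$\hat{\sigma}_{ML}^2 = \frac{\sum_{m \in \mathcal{M}} N(m)\, \hat{\sigma}_m^2\, [\hat{\beta}_{ML}]^m}{\sum_{m \in \mathcal{M}} N(m)} \tag{4.20b}$$

with $C_m$ still as in (4.16).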

It is worth discussing this special case in more detail not only for its own sake,
but also because it characterizes one of the components of each iteration of the EM
algorithm. The derivation of the parameter estimates in this case is essentially the
same as the derivation of the M step in the Appendix. We begin by differentiating the
likelihood function to find equations for its stationary points. This leads to a pair of
equations in terms of $\sigma^2$ and $\beta$. Eliminating $\sigma^2$ from these equations is straightforward
and gives (4.20a) directly. Having determined $\hat{\beta}_{ML}$ as the solution to this polynomial
equation, $\hat{\sigma}_{ML}^2$ is obtained by back substitution.

From Lemma C.1 in Appendix C, it is apparent that (4.20a) has exactly one
positive real solution, which is the ML estimate $\hat{\beta}_{ML}$. Hence, $L$ has a unique local
and hence global maximum. Moreover, we may use bisection as a method to find
the solution to this equation, provided we start with an initial interval containing
$\hat{\beta}_{ML}$. For instance, when we expect $0 < \gamma < 2$, an appropriate initial interval is
$1 < \beta < 4$. Naturally, with some caution, Newton iterations may be used to accelerate
convergence.

Again,  since  solving  equations  of  the  form  of  (4.20)  constitutes  the  M  step  of 
the  iterative  algorithm  for  the  more  general  problem,  the  above  remarks  are  equally 
applicable  in  those  contexts. 
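
The bisection procedure for the noise-free case can be sketched as follows. This is a minimal sketch under stated assumptions: the data are supplied as per-scale counts $N(m)$ and sample variances $\hat{\sigma}_m^2$, the function name and tolerance are hypothetical, and the default bracket is the interval $1 < \beta < 4$ suggested above for $0 < \gamma < 2$.

```python
import math


def estimate_noise_free(counts, sample_vars, lo=1.0, hi=4.0, tol=1e-12):
    """Solve (4.20a) by bisection, then back-substitute into (4.20b).

    counts[m] is N(m); sample_vars[m] is the sample variance at scale m.
    Returns (beta_ml, sigma2_ml, gamma_ml).
    """
    tot_n = sum(counts.values())
    tot_mn = sum(m * n for m, n in counts.items())

    def f(beta):
        # Left-hand side of (4.20a), with C_m as in (4.16).
        return sum((m / tot_mn - 1.0 / tot_n) * n * sample_vars[m] * beta ** m
                   for m, n in counts.items())

    if not (f(lo) < 0.0 < f(hi)):
        raise ValueError("initial interval does not contain the ML estimate")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:   # root lies below mid
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    sigma2 = sum(n * sample_vars[m] * beta ** m for m, n in counts.items()) / tot_n
    return beta, sigma2, math.log2(beta)


if __name__ == "__main__":
    counts = {m: 2 ** (m - 1) for m in range(1, 12)}
    true_beta = 2.0 ** 1.3
    sample_vars = {m: 3.0 * true_beta ** (-m) for m in counts}
    print(estimate_noise_free(counts, sample_vars))
```

The sign test relies on the left-hand side of (4.20a) being negative below its unique positive root and positive above it, which follows because $C_m$ is negative at the smallest scales and positive at the largest.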


4.2.4  Properties  of  the  Estimators 

In  this  section,  we  consider  two  principal  issues: 

-  how  the  parameter  estimates  of  the  EM  algorithm  converge  to  the  ML  parameter 
estimates,  and 

-  how  the  ML  parameter  estimates  converge  to  the  true  parameter  values. 

Regarding the first of these issues, we are assured that the EM algorithm always
adjusts the parameter estimates at each iteration so as to increase the likelihood
function until a stationary point is reached. It can be shown that in our problem,
the likelihood function has multiple stationary points, one of which corresponds to
the desired ML parameter estimates. Others correspond to rather pathological saddle
points of the likelihood function at the boundaries of the parameter space:

$$\hat{\beta} = \hat{\beta}_{ML}\big|_{\sigma_w^2 = 0}, \qquad \hat{\sigma}^2 = \hat{\sigma}_{ML}^2\big|_{\sigma_w^2 = 0}, \qquad \hat{\sigma}_w^2 = 0$$

and

$$\hat{\beta}\ \text{arbitrary}, \qquad \hat{\sigma}^2 = 0, \qquad \hat{\sigma}_w^2 = \frac{\sum_{m \in \mathcal{M}} N(m)\, \hat{\sigma}_m^2}{\sum_{m \in \mathcal{M}} N(m)}.$$

That they are saddle points is rather fortunate, for the only way they are reached
is if the starting value for any one of $\hat{\beta}$, $\hat{\sigma}^2$, $\hat{\sigma}_w^2$ is chosen to be exactly zero. Given
arbitrarily small positive choices for these initial parameters, the algorithm will iterate
towards the ML parameters.

The preceding discussion suggests that the EM algorithm is fundamentally rather
robust in this application. However, the selection of the initial parameter values will
naturally affect the rate of convergence of the algorithm. Moreover, it should be
noted that the EM algorithm converges substantially faster for the case in which $\sigma_w^2$
is known. In essence, for the general algorithm much of the iteration is spent locating
the noise threshold in the data.

We now turn to a discussion of the properties of the ML estimates themselves. It
is well-known that ML estimates are generally asymptotically efficient and consistent.
This, specifically, turns out to be the case here. It is also the case that at least in
some higher signal-to-noise ratio (SNR) scenarios, the Cramer-Rao bounds closely
approximate the true estimation error variances.

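To compute the Cramer-Rao bounds for the estimates of $\gamma$, $\sigma^2$, and $\sigma_w^2$ we
construct the corresponding Fisher matrix

$$\mathbf{I} = \frac{1}{2} \sum_{m \in \mathcal{M}} \frac{N(m)}{(\sigma_m^2)^2}
\begin{bmatrix}
[\ln 2^m\, \sigma^2 \beta^{-m}]^2 & -\ln 2^m\, \sigma^2 [\beta^{-m}]^2 & -\ln 2^m\, \sigma^2 \beta^{-m} \\
-\ln 2^m\, \sigma^2 [\beta^{-m}]^2 & [\beta^{-m}]^2 & \beta^{-m} \\
-\ln 2^m\, \sigma^2 \beta^{-m} & \beta^{-m} & 1
\end{bmatrix} \tag{4.21}$$

from which we get

$$\operatorname{Var} \hat{\gamma} \geq I^{11}$$

$$\operatorname{Var} \hat{\sigma}^2 \geq I^{22}$$

$$\operatorname{Var} \hat{\sigma}_w^2 \geq I^{33}$$

for any unbiased estimates $\hat{\gamma}$, $\hat{\sigma}^2$, $\hat{\sigma}_w^2$, and where $I^{kk}$ is the $k$th element on the diagonal
of $\mathbf{I}^{-1}$. However, local bounds such as these are of limited value in general both
because our estimates are biased and because the bounds involve the true parameter
values, which are unknown.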

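When $\sigma_w^2$ is known, the Fisher information matrix simplifies to the upper
submatrix

$$\mathbf{I} = \frac{1}{2} \sum_{m \in \mathcal{M}} N(m) \left( \frac{\beta^{-m}}{\sigma_m^2} \right)^2
\begin{bmatrix}
[\ln 2^m\, \sigma^2]^2 & -\ln 2^m\, \sigma^2 \\
-\ln 2^m\, \sigma^2 & 1
\end{bmatrix} \tag{4.22}$$

from which we get

$$\operatorname{Var} \hat{\gamma} \geq I^{11}$$

$$\operatorname{Var} \hat{\sigma}^2 \geq I^{22}.$$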

As  one  would  expect,  both  the  actual  error  variances  and  the  Cramer-Rao  bounds 
are  smaller  for  this  case.  Note  that  because  the  bounds  are  still  a  function  of  the 
parameters  in  this  case,  their  usefulness  remains  limited.  Nevertheless,  except  in  very 
low  SNR  settings,  the  estimate  biases  are  small  in  a  relative  sense  and  the  estimation 
error  variance  is  reasonably-well  approximated  by  these  bounds.  Hence,  the  bounds 
are  at  least  useful  in  reflecting  the  quality  of  estimation  that  can  be  expected  in 
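various scenarios.

When $\sigma_w^2 = 0$, we get still further simplification, and we can write

$$\mathbf{I} =
\begin{bmatrix}
\dfrac{(\ln 2)^2}{2} \sum_{m \in \mathcal{M}} m^2 N(m) & -\dfrac{\ln 2}{2\sigma^2} \sum_{m \in \mathcal{M}} m N(m) \\
-\dfrac{\ln 2}{2\sigma^2} \sum_{m \in \mathcal{M}} m N(m) & \dfrac{1}{2\sigma^4} \sum_{m \in \mathcal{M}} N(m)
\end{bmatrix}$$

from which we get

$$\operatorname{Var} \hat{\gamma} \geq \frac{2}{(\ln 2)^2 J} \sum_{m \in \mathcal{M}} N(m)$$

$$\operatorname{Var}(\hat{\sigma}^2 / \sigma^2) \geq \frac{2}{J} \sum_{m \in \mathcal{M}} m^2 N(m)$$

where

$$J = \left[ \sum_{m \in \mathcal{M}} m^2 N(m) \right] \left[ \sum_{m \in \mathcal{M}} N(m) \right] - \left[ \sum_{m \in \mathcal{M}} m N(m) \right]^2. \tag{4.23}$$

In this case, the bounds no longer depend on the parameters. Moreover, in
practice, these expressions give an excellent approximation to the variances of the ML
estimates. Evaluating the Cramer-Rao bounds asymptotically for the usual
implementation scenario described by (4.9), we get

$$\operatorname{Var} \hat{\gamma}_{ML} \sim 2 / [(\ln 2)^2 N] \tag{4.24a}$$

$$\operatorname{Var}(\hat{\sigma}_{ML}^2 / \sigma^2) \sim 2 (\log_2 N)^2 / N \tag{4.24b}$$

where $N$ is the number of observation samples.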
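
For a given arrangement of scales, the noise-free bounds and the quantity $J$ of (4.23) are simple to evaluate numerically. The following is a minimal sketch, assuming the coefficient counts are supplied as a dictionary from scale $m$ to $N(m)$; the function name is hypothetical.

```python
import math


def cramer_rao_noise_free(counts):
    """Cramer-Rao bounds for the case sigma_w^2 = 0.

    counts[m] is N(m). Returns the lower bounds on Var(gamma_hat) and on
    Var(sigma2_hat / sigma2), using J as defined in (4.23).
    """
    s0 = sum(counts.values())
    s1 = sum(m * n for m, n in counts.items())
    s2 = sum(m * m * n for m, n in counts.items())
    j = s2 * s0 - s1 * s1          # positive whenever two or more scales are present
    var_gamma = 2.0 * s0 / (math.log(2.0) ** 2 * j)
    var_rel_sigma2 = 2.0 * s2 / j
    return var_gamma, var_rel_sigma2


if __name__ == "__main__":
    for big_m in (8, 11, 14):
        # Scale arrangement of (4.9) with N_0 = 1.
        counts = {m: 2 ** (m - 1) for m in range(1, big_m + 1)}
        print(big_m, cramer_rao_noise_free(counts))
```

Neither bound involves $\sigma^2$ or $\gamma$, in keeping with the remark above that the bounds no longer depend on the parameters in this case.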

4.2.5  Simulations 

For the Monte Carlo simulations of this section, we synthesize discrete samples of
resolution-limited Gaussian 1/f processes embedded in stationary white Gaussian
noise. In general, we vary the length $N$ and SNR of the observations sequence as
well as the spectral exponent $\gamma$ of the underlying 1/f processes. We then perform
parameter estimation using algorithms for the most general scenario, corresponding
to the case in which all signal and noise parameters $\beta$, $\sigma^2$, $\sigma_w^2$ are unknown.

In Fig. 4-2, we plot the RMS error of the estimates of $\gamma$ and $\sigma^2$ for various values of
$\gamma$ as a function of SNR where the observation sequence length is fixed to $N = 2048$.
The results from 64 trials were averaged to obtain the error estimates shown. As
the results suggest, the quality of the estimates of both parameters is bounded as
a consequence of the finite length of the observations. Moreover, the bounds are
virtually independent of the value of $\gamma$ and are achieved asymptotically. For increasing
values of $\gamma$, the bounds would appear to be attained at increasing SNR thresholds.

In Fig. 4-3, we plot the RMS error of the estimates of $\gamma$ and $\sigma^2$ for various values
of $\gamma$ as a function of observation sequence length $N$ where the SNR is fixed to 20 dB.
Again, results from 64 trials were averaged to obtain the error estimates shown. While
the results show that the estimation error decreases with data length as expected,
they also suggest, particularly for the case of $\sigma^2$, that the convergence toward the
true parameters can be rather slow. Note, too, that a rather large amount of data is
required before the relative estimation error in $\sigma^2$ can be made reasonably small.

Figure 4-2: RMS Errors in the estimates of the signal parameters as a function of
the SNR of the observations. The symbols associated with each $\gamma$ mark the actual
empirical measurements; dashed lines are provided as visual aids only.

We conclude this section with a demonstration of the tracking capabilities of the
parameter estimation algorithm. Specifically, Fig. 4-4 illustrates the performance of
the parameter estimation in tracking a step-change in the spectral exponent $\gamma$ of
a noise-free 1/f signal. The signal was constructed such that the left and right
halves of the signal correspond to $\gamma = 0.90$ and $\gamma = 1.10$, respectively, but identical
variances. Local estimates of $\gamma$ are computed by applying the Case III parameter
estimation algorithm to the signal under a sliding window of length 16 384 centered
about the point of interest. Note that the algorithm not only accurately resolves the
appropriate spectral exponents, but accurately locates the point of transition as well.
Finally, we remark that, as in any such tracking algorithm, using a wider estimation
window would reduce the variance in the parameter estimates within each half of the
waveform, but at the expense of an increase in the width of the transition zone.

4.3 Smoothing of 1/f Signals

In this section, we consider the problem of extracting a 1/f signal from a background
of additive stationary white noise. There are many potential problems involving
signal enhancement and restoration to which the resulting smoothing algorithms can
be applied. For this signal estimation problem, we use a Bayesian framework to
derive algorithms that are optimal with respect to a mean-square error criterion. We
specifically consider the Gaussian case, for which the resulting algorithms not only
yield estimates having the minimum possible mean-square error, but correspond to
linear data processors as well. However, more generally, for non-Gaussian scenarios
the estimators we derive are optimal in a linear least-squares sense, i.e., no other linear
data processor will be capable of yielding signal estimates with a smaller mean-square
error [64].

While we do not specifically derive our signal estimation in terms of Wiener
filtering in the frequency domain, interpretations in this domain provide useful insight.
In particular, it is clear that at high frequencies the white noise spectrum will
dominate, while at low frequencies the 1/f signal spectrum will dominate. In fact, at
sufficiently low frequencies, there will always be arbitrarily high SNR regardless of
the noise threshold. Consequently, Wiener filtering for this problem involves a form
of low-pass filtering, where the exact filter shape and “cut-off” are governed by the
particular parameters of the noise and signal spectra.

Figure 4-3: RMS Errors in the estimates of the signal parameters as a function of the
data length $N$ of the observations. Again, the symbols associated with each $\gamma$ mark
the actual empirical measurements; dashed lines are provided as visual aids only.
(a) Absolute RMS error in $\hat{\gamma}_{ML}$ versus data length $N$.

Our basic formulation is to consider the estimation of a 1/f signal $x(t)$ from noisy
observations $r(t)$ of the form (4.6), viz.,

$$r(t) = x(t) + w(t)$$

where $w(t)$ is stationary white noise, and where we still consider zero-mean processes.
We shall assume in our derivation that the signal and noise parameters $\beta$, $\sigma^2$, $\sigma_w^2$ are
all known, though, in practice they are typically estimated using the parameter
algorithms of the last section. In fact, the parameter and signal estimation problems are
quite closely coupled. Indeed it will become apparent in our subsequent development
that smoothing was inherently involved in the parameter estimation process as well.

We, again, exploit the wavelet decomposition to obtain our results. Specifically,
we begin with the set of wavelet coefficients (4.8). Then, since

$$r_n^m = x_n^m + w_n^m,$$

where the $x_n^m$ and $w_n^m$ are mutually independent zero-mean Gaussian random variables
with variances $\sigma^2 \beta^{-m}$ and $\sigma_w^2$, respectively, it follows
that the least-squares estimates are linear and given by

$$\hat{x}_n^m = \left[ \frac{\sigma^2 \beta^{-m}}{\sigma^2 \beta^{-m} + \sigma_w^2} \right] r_n^m. \tag{4.26}$$

Note that, consistent with our earlier discussion of Wiener filtering for this
problem, the smoothing factor

$$\frac{\sigma^2 \beta^{-m}}{\sigma^2 \beta^{-m} + \sigma_w^2}$$

in (4.26) has a thresholding role: at coarser scales where the signal predominates
the coefficients are retained, while at finer scales where noise predominates, the
coefficients are discarded. Note, too, that this factor appears in (4.14c), which allows
us to interpret (4.13b) in terms of sample-variance estimates of the smoothed data.
Evidently, smoothing is inherently involved in the parameter estimation problem.

Interpreting the optimal estimator (4.26) in terms of the whitening filters of Section 4.1
leads to a conceptually convenient and familiar realization. In particular,
as depicted in Fig. 4-5, the optimal linear processor consists of two stages. In the
first stage, the noisy observations $r(t)$ are processed by a whitening filter with kernel
$\kappa_w(t,\tau)$ given by (4.4b) to generate an intermediate white innovations process $v(t)$
whose wavelet coefficients are

$$v_n^m = \frac{r_n^m}{\sigma_m},$$

where $\sigma_m^2 = \sigma^2\beta^{-m} + \sigma_w^2$ is the variance of $r_n^m$. In the second stage, $v(t)$ is
processed by an innovations filter with kernel

$$\kappa_i(t,\tau) = \sum_m \sum_n \psi_n^m(t) \left[\frac{\sigma^2\beta^{-m}}{\sigma_m}\right] \psi_n^m(\tau) \qquad (4.27)$$

to generate the optimal estimate $\hat{x}(t)$ with wavelet coefficients given by (4.25). This
innovations-based implementation is a classical estimation structure [65].
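
As a quick check of the two-stage structure, the following sketch applies the whitening and
innovations gains to a single coefficient and compares the result with the direct
least-squares gain. It works on one coefficient at a time; all function names and parameter
values are hypothetical.

```python
import math


def sigma_m(m, sigma2, beta, sigma_w2):
    """Standard deviation of r_n^m: sqrt(sigma^2 * beta^(-m) + sigma_w^2)."""
    return math.sqrt(sigma2 * beta ** (-m) + sigma_w2)


def whiten(m, r, sigma2, beta, sigma_w2):
    """Stage 1: v_n^m = r_n^m / sigma_m (unit-variance innovations)."""
    return r / sigma_m(m, sigma2, beta, sigma_w2)


def innovations_filter(m, v, sigma2, beta, sigma_w2):
    """Stage 2: scale the innovations by sigma^2 * beta^(-m) / sigma_m."""
    return sigma2 * beta ** (-m) / sigma_m(m, sigma2, beta, sigma_w2) * v


def direct_estimate(m, r, sigma2, beta, sigma_w2):
    """Single-stage least-squares gain applied to r_n^m."""
    signal_var = sigma2 * beta ** (-m)
    return signal_var / (signal_var + sigma_w2) * r


params = (1.0, 2.0, 0.5)  # hypothetical sigma^2, beta, sigma_w^2
for m in (-3, 0, 3):
    r = 1.7
    two_stage = innovations_filter(m, whiten(m, r, *params), *params)
    assert math.isclose(two_stage, direct_estimate(m, r, *params))
```

The cascade reproduces the direct estimate because the product of the two stage gains is
$\sigma^2\beta^{-m}/\sigma_m^2$.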

Figure 4-5: A canonic form implementation of the optimal linear filter for estimating
a 1/f signal $x(t)$ from noisy observations $r(t)$. The linear least-squares filter is the
cascade of a whitening filter followed by an innovations filter. The intermediate
innovations process $v(t)$ is stationary and white.

In practice, good performance is achieved by these estimators even in very poor
SNR scenarios. This is not surprising given the preponderance of energy at low
frequencies (coarse scales) in 1/f-type processes. Let us, then, turn to a quantitative
analysis of the estimation error. First, we note that because our set of observations
is finite the total mean-square estimation error

$$\int E\left[(\hat{x}(t) - x(t))^2\right] dt$$

is infinite. Nevertheless, when we define

$$\tilde{x}(t) = \sum_{m,n \in \mathcal{R}} x_n^m\, \psi_n^m(t)$$

as the best possible approximation to $x(t)$ from the finite data set, we can express
the total mean-square error in our estimate with respect to $\tilde{x}(t)$ as

$$\begin{aligned}
\varepsilon &= \int E\left[(\hat{x}(t) - \tilde{x}(t))^2\right] dt \\
&= \sum_{m,n \in \mathcal{R}} E\left[(\hat{x}_n^m - x_n^m)^2\right] \\
&= \sum_{m,n \in \mathcal{R}} E\left[\mathrm{Var}(x_n^m \mid r_n^m)\right]
\end{aligned}$$

which, through routine manipulation, reduces to

$$\varepsilon = \sum_{m \in \mathcal{M}} N(m)\, \frac{\sigma^2\beta^{-m}\, \sigma_w^2}{\sigma^2\beta^{-m} + \sigma_w^2}. \qquad (4.28)$$
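
The total error is a sum of per-coefficient conditional variances, so it is easy to evaluate
numerically. The sketch below does this for a hypothetical observation set (scales
$m = 1, \ldots, 11$ with $2^{m-1}$ coefficients each) and converts the result to a predicted gain
in dB. The gain measure used here, noise energy before smoothing divided by error energy
after smoothing, is an assumption made for illustration.

```python
import math


def total_mse(counts, sigma2, beta, sigma_w2):
    """Sum over scales of N(m) * Var(x_n^m | r_n^m)."""
    total = 0.0
    for m, n_m in counts.items():
        signal_var = sigma2 * beta ** (-m)
        total += n_m * signal_var * sigma_w2 / (signal_var + sigma_w2)
    return total


def predicted_gain_db(counts, sigma2, beta, sigma_w2):
    """Assumed gain measure: noise energy in, divided by error energy out."""
    noise_energy = sigma_w2 * sum(counts.values())
    return 10.0 * math.log10(noise_energy / total_mse(counts, sigma2, beta, sigma_w2))


# Hypothetical observation set: scales m = 1..11 with 2^(m-1) coefficients each.
counts = {m: 2 ** (m - 1) for m in range(1, 12)}
for gamma in (0.33, 1.00, 1.67):
    gain = predicted_gain_db(counts, 1.0, 2.0 ** gamma, 1.0)
    print(f"gamma={gamma:.2f}: predicted gain {gain:.1f} dB")
```

Each term of the sum is smaller than $\sigma_w^2$, so the predicted gain under this measure is
always positive. Any negative gain seen in practice must therefore come from effects outside
this model.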


4.3.1  Simulations 


For the simulations of this section, we synthesize discrete samples of resolution-limited
Gaussian 1/f processes embedded in Gaussian white noise. In general, we vary the
SNR of the observation sequence as well as the spectral exponent $\gamma$ of the underlying
1/f processes. We then perform parameter estimation, followed by signal estimation,
using algorithms for the most general scenario, corresponding to the case in which all
signal and noise parameters $\beta$, $\sigma^2$, $\sigma_w^2$ are unknown. Note that by using the estimated
parameters in the signal estimation algorithm, our experiments do not allow us to
distinguish between those components of signal estimation error due to errors in the
estimated parameter values and those due to the smoothing process itself. However,
it turns out that the quality of the signal estimation is generally rather insensitive to
errors in the parameter estimates used.

In Fig. 4-6, we plot the SNR gain of our smoothed signal estimates for various
values of $\gamma$ as a function of the SNR of the observations, where the sequence length
is fixed to $N = 2048$. Again, results from 64 trials were averaged to obtain the
error estimates shown. The SNR gains predicted by the total mean-square error
formula (4.28) are also superimposed on each plot. As the results indicate, the actual
SNR gain is typically no more than 1 dB below the predicted gain, as would be
expected. However, under some circumstances the deviation can be more than 3
dB. Worse, the SNR gain can be negative, i.e., the net effect of smoothing can be
to increase the overall distortion in the signal. Such degradations in performance
are due primarily to limitations on the accuracy to which the wavelet coefficients at
coarser scales can be extracted via the DWT. In particular, they arise as a result of
undesired effects introduced by modeling the data outside the observation interval as
periodic to accommodate the inherent data-windowing problem. By contrast, error
in the parameter estimates is a much less significant factor in these degradations at
reasonably high SNR. The plots also indicate that better gains are achieved for larger
values of $\gamma$ for a given SNR. This is to be expected since for larger values of $\gamma$ there
is more signal energy at coarser scales and correspondingly less at finer scales where
the noise predominates and the most attenuation takes place.

Figure 4-6: SNR gain (dB) of the signal estimate as a function of the SNR of the
observations. Both the gains predicted by eq. (4.28) and gains actually obtained are
indicated.

We conclude this section with a signal estimation example. In Fig. 4-7 we show
a segment of a 65536-sample 1/f signal, the same signal embedded in noise, and the
signal estimate. In this example, the spectral exponent is $\gamma = 1.67$, and the SNR
in the observations is 0 dB. The estimated spectral exponent is $\hat{\gamma}_{ML} = 1.66$, and
the SNR gain of the signal estimate is 13.9 dB. As anticipated, the signal estimate
effectively preserves detail at the coarse scales where the SNR was high, while detail
at fine scales is lost where the SNR was low.


4.4 Coherent Detection in 1/f Noise

In this section we consider the problem of detecting a known signal of finite energy in
a background of Gaussian 1/f and white noise. In general, the detection algorithms
we shall derive are applicable to many potential problems involving synchronous
communication and pattern recognition.

This rather fundamental problem has received some prior attention in the literature.
Indeed, Barton and Poor considered the detection of known signals in fractional
Gaussian noise backgrounds in [42]. Nevertheless, our formulation has a few
distinguishing features, an example of which is that we consider 1/f processes corresponding
to a much broader range of spectral exponents $\gamma$. However, perhaps the principal
distinction, apart from one of approach, is the inclusion of stationary white Gaussian
measurement noise in our model. In general, this refinement improves the robustness
properties of the resulting algorithms and precludes certain singular detection
scenarios. Furthermore, it shall become apparent that the wavelet-based approach
we follow is not only analytically and conceptually convenient, but leads to practical
implementation structures as well.

Let us pose our detection problem in terms of a binary hypothesis test with a
Neyman-Pearson optimality criterion. Given noisy observations $r(t)$, we wish to
determine a rule for deciding whether or not a known signal is present in the
observations. For our test formulation, under hypothesis $H_1$ we observe a signal of energy
$E_0 > 0$ against a background of Gaussian 1/f and white noise, while under hypothesis
$H_0$ we observe only the background noise, i.e.,

$$H_1: r(t) = \sqrt{E_0}\, s(t) + x(t) + w(t)$$

$$H_0: r(t) = x(t) + w(t)$$

where $w(t)$ is stationary white Gaussian noise, $x(t)$ is Gaussian 1/f-type noise,
and $s(t)$ is a unit energy signal:

$$\int_{-\infty}^{\infty} s^2(t)\, dt = 1.$$

We shall assume $w(t)$ and $x(t)$ to be statistically independent processes under either
hypothesis. Let us further assume our observations generally extend over the infinite
interval $-\infty < t < \infty$. The problem is then to design a decision rule that maximizes
the probability of detecting $s(t)$

$$P_D = \Pr(\text{decide } H_1 \mid H_1 \text{ true})$$

subject to a constraint on the maximum allowable false alarm probability

$$P_F = \Pr(\text{decide } H_1 \mid H_0 \text{ true}).$$

As is well-known, the solution to this problem takes the form of a likelihood ratio test
[64].

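An equivalent hypothesis test can be constructed in terms of observations of the
respective wavelet coefficients

$$r = \{r_n^m\}$$

as

$$H_1: r_n^m = \sqrt{E_0}\, s_n^m + x_n^m + w_n^m$$

$$H_0: r_n^m = x_n^m + w_n^m$$

for $-\infty < m < \infty$ and $-\infty < n < \infty$. According to our model, under each
hypothesis, the coefficients $w_n^m$ and $x_n^m$ are all statistically independent, and have
variances $\sigma_w^2$ and $\sigma^2\beta^{-m}$, respectively.

In this case, since the joint distributions of the observations under the respective
hypotheses are

$$p_{r|H_1}(r) = \prod_{m,n} \frac{1}{\sqrt{2\pi\sigma_m^2}} \exp\left[-\frac{(r_n^m - \sqrt{E_0}\, s_n^m)^2}{2\sigma_m^2}\right]$$

$$p_{r|H_0}(r) = \prod_{m,n} \frac{1}{\sqrt{2\pi\sigma_m^2}} \exp\left[-\frac{(r_n^m)^2}{2\sigma_m^2}\right]$$

with $\sigma_m^2 = \sigma^2\beta^{-m} + \sigma_w^2$, the likelihood ratio

$$\frac{p_{r|H_1}(r)}{p_{r|H_0}(r)}$$

can be simplified substantially to yield a test of the form

$$\ell = \sum_m \sum_n \frac{r_n^m\, s_n^m}{\sigma_m^2} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \alpha \qquad (4.29)$$

where $\alpha$ is the threshold of the test.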

This optimal detector may be realized using a whitening-filter-based implementation
as shown in Fig. 4-8. The statistic $\ell$ is constructed by processing both $r(t)$ and
$\sqrt{E_0}\, s(t)$ with a prewhitening filter whose kernel is given by (4.4b), and correlating
the respective outputs $r_*(t)$ and $s_*(t)$. It is straightforward to verify this implementation:
since the prewhitened signals $r_*(t)$ and $s_*(t)$ have wavelet coefficients $r_n^m/\sigma_m$
and $\sqrt{E_0}\, s_n^m/\sigma_m$, respectively, it suffices to recognize the expression for $\ell$ in (4.29) as
the inner product between $s_*(t)/\sqrt{E_0}$ and $r_*(t)$, which allows us to rewrite (4.29) as

$$\ell = \int r_*(t)\, s_*(t)/\sqrt{E_0}\; dt \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \alpha.$$

This is, of course, a canonical form receiver for optimal detection in the presence of
colored noise as described in [64].

Figure 4-8: Canonical prewhitening implementation of the optimal receiver for
detection of a known $s(t)$ in the presence of both Gaussian 1/f and stationary white
Gaussian noise, where $\kappa_w(t,\tau)$ is the kernel of the whitening filter for 1/f-plus-white
noise.

Let us turn now to a discussion of performance of this optimal receiver. Via the
implementation of this receiver in terms of the whitened observations $r_*(t)$, we note
that the performance is necessarily equivalent to that of an optimal detector for $s_*(t)$
in the presence of stationary white Gaussian noise of unit variance. Indeed, if we
define the performance index $d$ according to

$$d^2 \triangleq \int_{-\infty}^{\infty} s_*^2(t)\, dt = E_0 \sum_m \sum_n \frac{(s_n^m)^2}{\sigma_m^2} \qquad (4.30)$$

then

$$E[\ell \mid H_0] = 0$$
$$E[\ell \mid H_1] = d^2/\sqrt{E_0}$$
$$\mathrm{Var}\{\ell \mid H_0\} = \mathrm{Var}\{\ell \mid H_1\} = d^2/E_0.$$
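
A minimal sketch of the wavelet-domain statistic and the performance index may help fix
ideas. It assumes a finite set of coefficients stored per scale. The dictionary layout, the
function names, and the numerical values are hypothetical.

```python
import math


def noise_var(m, sigma2, beta, sigma_w2):
    """sigma_m^2 = sigma^2 * beta^(-m) + sigma_w^2, the variance of r_n^m."""
    return sigma2 * beta ** (-m) + sigma_w2


def detection_statistic(r, s, sigma2, beta, sigma_w2):
    """Correlate observed and signal coefficients, weighted by 1/sigma_m^2."""
    return sum(
        r_mn * s_mn / noise_var(m, sigma2, beta, sigma_w2)
        for m in s
        for r_mn, s_mn in zip(r[m], s[m])
    )


def deflection_squared(s, e0, sigma2, beta, sigma_w2):
    """d^2 = E0 * sum over (m, n) of (s_n^m)^2 / sigma_m^2."""
    return e0 * sum(
        s_mn ** 2 / noise_var(m, sigma2, beta, sigma_w2) for m in s for s_mn in s[m]
    )


# Hypothetical unit-energy signal spread over two scales (0.36 + 0.64 = 1).
s = {0: [0.6], 2: [0.8, 0.0]}
e0, sigma2, beta, sigma_w2 = 4.0, 1.0, 2.0, 0.5

# Noise-free observation under H1: r_n^m = sqrt(E0) * s_n^m.
r = {m: [math.sqrt(e0) * s_mn for s_mn in row] for m, row in s.items()}
stat = detection_statistic(r, s, sigma2, beta, sigma_w2)
d2 = deflection_squared(s, e0, sigma2, beta, sigma_w2)
print(stat, d2 / math.sqrt(e0))  # the two values should agree
```

With the noise removed, the statistic equals its mean under $H_1$, namely $d^2/\sqrt{E_0}$.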


Hence, expressing our arbitrary threshold in the form

$\alpha =$

for some $-\infty < \eta < \infty$, the performance of the test can be described in terms of the
detection and false alarm probabilities, respectively

where

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-v^2/2}\, dv. \qquad (4.32)$$

The familiar receiver operating characteristic (ROC) associated with such Gaussian
detection problems is as shown in Fig. 4-9 for various values of $d$. However,
while such performance projections are generally useful and suggest the viability of
detection in 1/f-type backgrounds, there is need for substantiation of these results
through a comprehensive set of Monte Carlo simulations. Indeed, it is only through
such an evaluation that we can ultimately verify the anticipated insensitivity of the
algorithms to our particular 1/f model.
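
A family of ROC curves indexed by $d$ can be traced with a few lines of code. The sketch
below rests on an assumption: it uses a normalized threshold `tau`, the statistic's threshold
divided by the statistic's standard deviation, rather than the threshold parameterization of
the text. Under that normalization the statistic is unit-variance Gaussian with mean 0 under
$H_0$ and mean $d$ under $H_1$. The function names are illustrative.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = Pr(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def roc_point(d, tau):
    """Return (P_F, P_D) for deflection d and normalized threshold tau.

    Assumes the normalized statistic is N(0, 1) under H0 and N(d, 1) under H1.
    """
    return q_function(tau), q_function(tau - d)


for d in (0.5, 1.0, 2.0, 4.0):
    points = [roc_point(d, 0.5 * k) for k in range(-8, 13)]
    p_f, p_d = roc_point(d, d / 2.0)  # threshold midway between the two means
    print(f"d={d}: P_F={p_f:.4f}, P_D={p_d:.4f} at tau=d/2 ({len(points)} points traced)")
```

Sweeping `tau` for a fixed `d` traces one ROC curve. Larger `d` pushes the curve toward the
ideal corner $P_F = 0$, $P_D = 1$.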

In concluding this section, we make some brief remarks on the problem of optimum
signal design for use in 1/f-plus-white backgrounds. Based on our analysis, it
is apparent that we can optimize performance if we choose $s(t)$, or equivalently its
wavelet coefficients $s_n^m$, to maximize $d^2$ in (4.30) subject to the energy constraint

$$\int_{-\infty}^{\infty} s^2(t)\, dt = \sum_m \sum_n (s_n^m)^2 = 1.$$

However, this signal optimization problem is not well-posed. Indeed, because of the
spectral distribution of the background noise, the optimization will attempt to construct
a signal whose energy is at frequencies sufficiently high that the 1/f noise
is negligible compared to the white component. Consequently, to preclude the generation
of an arbitrarily high frequency signal, generally some form of bandwidth
constraint is necessary. For an example of how this is accommodated in a communications
scenario, see [66].
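
The ill-posedness can be seen in a one-line calculation. If all of the signal energy is placed
on a single coefficient at scale $m$, then $d^2 = E_0 / (\sigma^2\beta^{-m} + \sigma_w^2)$, which
increases with $m$ and approaches $E_0/\sigma_w^2$ without attaining it. The sketch below
evaluates this under hypothetical parameter values.

```python
def d_squared_single_scale(m, e0, sigma2, beta, sigma_w2):
    """d^2 when the unit-energy signal occupies one coefficient at scale m."""
    return e0 / (sigma2 * beta ** (-m) + sigma_w2)


e0, sigma2, beta, sigma_w2 = 1.0, 1.0, 2.0, 1.0  # hypothetical values
values = [d_squared_single_scale(m, e0, sigma2, beta, sigma_w2) for m in range(0, 11)]
for m, value in enumerate(values):
    print(f"m={m:2d}: d^2={value:.6f} (supremum {e0 / sigma_w2:.6f})")
```

Every finer scale gives a strictly larger $d^2$, so no finite-bandwidth signal is optimal
unless a bandwidth constraint is imposed.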

4.5 Discriminating Between 1/f Signals


In this section, we consider the ability of an optimal Bayesian detector to discriminate
between Gaussian 1/f processes of distinct parameters in a background of stationary
white Gaussian noise. The signal classification algorithms we derive are useful
in a variety of potential applications. The problem of distinguishing 1/f processes
is, of course, very closely related to the parameter estimation problem treated in
Section 4.2. Indeed, parameter estimation can be viewed as distinguishing among
an arbitrarily large number of 1/f processes with incrementally different parameters.
Nevertheless, as we shall see, approaching the problem from a detection perspective
affords a number of new and useful insights.

It is, again, convenient to formulate our problem in terms of a binary hypothesis
test in which under each hypothesis we have noisy observations $r(t)$ of distinct 1/f
signals. Specifically, we have as our two hypotheses

$$H_0: r(t) = \dot{x}(t) + w(t) \qquad (4.33a)$$

$$H_1: r(t) = \ddot{x}(t) + w(t) \qquad (4.33b)$$

where $\dot{x}(t)$ and $\ddot{x}(t)$ are Gaussian 1/f processes$^4$ with distinct parameters and $w(t)$ is
a white measurement noise, statistically independent of $\dot{x}(t)$ or $\ddot{x}(t)$, whose variance is
the same under both hypotheses. For this test we will develop a minimum probability
of error ($\Pr(\varepsilon)$) decision rule under the assumption of equally likely hypotheses.

Once again, our optimum receiver is best developed and analyzed in the wavelet
domain. Rewriting the hypothesis test in terms of the corresponding wavelet coefficients
as

$$H_0: r_n^m = \dot{x}_n^m + w_n^m$$

$$H_1: r_n^m = \ddot{x}_n^m + w_n^m,$$

we model the $r_n^m$ under each hypothesis as a collection of zero-mean statistically
independent Gaussian random variables with variances

$$\mathrm{Var}\{r_n^m \mid H_0\} = \dot{\sigma}_m^2 = \dot{\sigma}^2 \dot{\beta}^{-m} + \sigma_w^2 \qquad (4.34a)$$
$$\mathrm{Var}\{r_n^m \mid H_1\} = \ddot{\sigma}_m^2 = \ddot{\sigma}^2 \ddot{\beta}^{-m} + \sigma_w^2 \qquad (4.34b)$$

where

$$\dot{\beta} = 2^{\dot{\gamma}}$$
$$\ddot{\beta} = 2^{\ddot{\gamma}}.$$

$^4$ We use the notation $\dot{\ }$ and $\ddot{\ }$ to distinguish the 1/f processes and their respective parameters
under the two hypotheses. These symbols should not be confused with differentiation operators, for
which we have generally reserved the notation $'$ and $''$.

In our derivation we shall assume that, in general, only a finite collection of
observation coefficients of the form

$$r = \{r_n^m \in \mathcal{R}\} = \{r_n^m,\; m \in \mathcal{M},\; n \in \mathcal{N}(m)\},$$

where $\mathcal{M}$ and $\mathcal{N}(m)$ are as defined in (4.7), are available. In fact, as we shall see,
the problem turns out to be singular (i.e., perfect detection is achievable) if complete
observations over the infinite interval are available. In our simulations, we shall
assume our observation set $\mathcal{R}$ to be of the particular form (4.9), which corresponds
to the collection of coefficients generally available from time- and resolution-limited
observations of $r(t)$ via a DWT algorithm as discussed in Section 2.2.4.

The likelihood ratio test for this problem can be simplified to a test of the form

$$\ell = \frac{1}{2} \sum_{m \in \mathcal{M}} N(m) \left\{ \left[\frac{1}{\dot{\sigma}_m^2} - \frac{1}{\ddot{\sigma}_m^2}\right] \hat{\sigma}_m^2 - \ln\frac{\ddot{\sigma}_m^2}{\dot{\sigma}_m^2} \right\} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; 0 \qquad (4.35)$$

where the $\hat{\sigma}_m^2$ are, again, the sample-variances defined via (4.12) which summarize
the aspects of interest in the data. It is straightforward to show that this test can
be implemented in the canonical form shown in Fig. 4-10. Here the observations $r(t)$
are processed by 1/f-plus-white whitening filters corresponding to each hypothesis,
for which the respective kernels are

$$\dot{\kappa}_w(t,\tau) = \sum_m \sum_n \psi_n^m(t)\, \frac{1}{\dot{\sigma}_m}\, \psi_n^m(\tau)$$

$$\ddot{\kappa}_w(t,\tau) = \sum_m \sum_n \psi_n^m(t)\, \frac{1}{\ddot{\sigma}_m}\, \psi_n^m(\tau).$$

Consequently, only one of the residual processes $\dot{v}(t)$ and $\ddot{v}(t)$ is white, depending on
which hypothesis is true. To decide between the two hypotheses, the receiver computes
the difference in energy in the two residuals and compares it to the appropriate
threshold.

Figure 4-10: A canonical form implementation of the optimal receiver for discriminating
between 1/f models with distinct parameters based on noisy observations $r(t)$.
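
The decision rule can be sketched directly from the per-scale sample variances. The code
below assumes zero-mean independent Gaussian coefficients with scale-dependent variances
under each hypothesis and forms the log-likelihood ratio with a zero threshold. The function
names, the `(sigma2, gamma)` tuple layout, and the parameter values are hypothetical.

```python
import math


def scale_var(m, sigma2, gamma, sigma_w2):
    """Variance of r_n^m under one hypothesis: sigma^2 * 2^(-gamma*m) + sigma_w^2."""
    return sigma2 * 2.0 ** (-gamma * m) + sigma_w2


def log_likelihood_ratio(sample_vars, counts, h0, h1, sigma_w2):
    """Log-likelihood ratio; decide H1 when the result is positive.

    sample_vars[m] is the sample variance at scale m, counts[m] = N(m),
    and h0, h1 are (sigma2, gamma) pairs for the two hypotheses.
    """
    total = 0.0
    for m, n_m in counts.items():
        v0 = scale_var(m, *h0, sigma_w2)
        v1 = scale_var(m, *h1, sigma_w2)
        total += n_m * ((1.0 / v0 - 1.0 / v1) * sample_vars[m] - math.log(v1 / v0))
    return 0.5 * total


counts = {m: 2 ** (m - 1) for m in range(1, 8)}  # hypothetical: N = 128 samples
h0, h1, sigma_w2 = (1.0, 1.0), (1.0, 1.5), 0.01

# Idealized data whose sample variances match each hypothesis exactly.
vars_h0 = {m: scale_var(m, *h0, sigma_w2) for m in counts}
vars_h1 = {m: scale_var(m, *h1, sigma_w2) for m in counts}
print(log_likelihood_ratio(vars_h0, counts, h0, h1, sigma_w2))  # negative
print(log_likelihood_ratio(vars_h1, counts, h0, h1, sigma_w2))  # positive
```

The two printed values have opposite signs, so idealized data from either hypothesis falls on
the correct side of the zero threshold.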

Although the detection problem is Gaussian, it is apparent that the log-likelihood
$\ell$ is not conditionally Gaussian under either hypothesis. Consequently,
evaluating the performance of such receivers is rather difficult in general. Nevertheless,
it is possible to obtain approximate performance results by exploiting a procedure
described in [64] based upon the use of the Chernoff bound. Specifically, defining

$$\mu(s) = \ln E\left[e^{s\ell} \mid H_0\right],$$

for an arbitrary real parameter $s$, we can bound the performance of our optimal
detector according to

$$\Pr(\varepsilon) \le \frac{1}{2}\, e^{\mu(s_*)} \qquad (4.36)$$

where $s_*$ is the parameter value yielding the best possible bound, i.e.,

$$s_* = \arg\min_s \mu(s).$$

When there are sufficiently many observations to justify modeling $\ell$ as Gaussian
via a central limit theorem (CLT) argument, we can also obtain the following asymptotic
expression for the error probability

$$\Pr(\varepsilon) \approx \frac{1}{2 s_* (1 - s_*) \sqrt{2\pi \mu''(s_*)}}\, e^{\mu(s_*)} \qquad (4.37)$$

which is a more optimistic and accurate estimate of the achievable performance [64].
In our case, $\mu(s)$ and its first two derivatives are given by

$$\mu(s) = \frac{1}{2} \sum_{m \in \mathcal{M}} N(m) \left\{ s \ln\frac{\dot{\sigma}_m^2}{\ddot{\sigma}_m^2} - \ln\left[ s\, \frac{\dot{\sigma}_m^2}{\ddot{\sigma}_m^2} + (1 - s) \right] \right\} \qquad (4.38a)$$

$$\mu'(s) = \frac{1}{2} \sum_{m \in \mathcal{M}} N(m) \left\{ \ln\frac{\dot{\sigma}_m^2}{\ddot{\sigma}_m^2} - \frac{\dot{\sigma}_m^2/\ddot{\sigma}_m^2 - 1}{s\, \dot{\sigma}_m^2/\ddot{\sigma}_m^2 + (1 - s)} \right\} \qquad (4.38b)$$

$$\mu''(s) = \frac{1}{2} \sum_{m \in \mathcal{M}} N(m) \left[ \frac{\dot{\sigma}_m^2/\ddot{\sigma}_m^2 - 1}{s\, \dot{\sigma}_m^2/\ddot{\sigma}_m^2 + (1 - s)} \right]^2. \qquad (4.38c)$$

It is generally not possible to derive a closed form expression for the minimum value of
$\mu(s)$ via either (4.38a) or (4.38b) for this asymmetric detection problem. Fortunately,
though, a numerical optimization is reasonable: it suffices to consider a numerical
search over values of $s$ within a limited range. Indeed, since

$$\mu(0) = \mu(1) = 0,$$

and since from (4.38c) we have that $\mu(s)$ is convex

$$\mu''(s) \ge 0,$$

it follows that the minimum of $\mu(s)$ can be found in the range $0 \le s \le 1$.

4.5.1  Simulations 

In this section, we obtain, via (4.36) and (4.37), numerical estimates of the probability
of error performance. For our scenario, we assume that coefficients $r_n^m$ are available
in the range (4.9) consistent with what could be extracted from $N = 2^M$ samples of
time- and resolution-limited observations via a DWT algorithm. In our simulations,
we consider the ability of the optimal receiver to distinguish between 1/f processes of
different spectral exponents $\gamma$ (or, equivalently, fractal dimensions $D$, or self-similarity
parameters $H$). In particular, we do not consider the capabilities of the algorithms
to discriminate on the basis of variance differences. Consequently, in all our tests, we
choose the variance parameters $\dot{\sigma}^2$ and $\ddot{\sigma}^2$ such that the variance of the observations
is identical under either hypothesis.
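
Numerical estimates of this kind can be sketched as follows. The code assumes independent
zero-mean Gaussian coefficients with per-scale variance ratio
$\rho_m = \dot{\sigma}_m^2/\ddot{\sigma}_m^2$, evaluates the Chernoff exponent and its derivatives,
locates the minimizer by bisection on the first derivative (valid because the exponent is
convex and vanishes at $s = 0$ and $s = 1$), and then forms both the bound and the CLT-based
approximation. For brevity it does not equalize the observation variance across hypotheses.
All names and parameter values are hypothetical.

```python
import math


def mu_terms(s, ratios, counts):
    """Return (mu, mu', mu'') at s; ratios[m] is the H0/H1 variance ratio at scale m."""
    mu = d1 = d2 = 0.0
    for m, n_m in counts.items():
        rho = ratios[m]
        den = s * rho + (1.0 - s)
        mu += 0.5 * n_m * (s * math.log(rho) - math.log(den))
        d1 += 0.5 * n_m * (math.log(rho) - (rho - 1.0) / den)
        d2 += 0.5 * n_m * ((rho - 1.0) / den) ** 2
    return mu, d1, d2


def minimize_mu(ratios, counts, tol=1e-12):
    """Bisection on mu'(s) over [0, 1]; mu' is nondecreasing since mu is convex."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mu_terms(mid, ratios, counts)[1] < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def error_estimates(ratios, counts):
    """Return (s_star, Chernoff bound, CLT-based approximation)."""
    s = minimize_mu(ratios, counts)
    mu, _, d2 = mu_terms(s, ratios, counts)
    chernoff = 0.5 * math.exp(mu)
    clt = math.exp(mu) / (2.0 * s * (1.0 - s) * math.sqrt(2.0 * math.pi * d2))
    return s, chernoff, clt


# Hypothetical scenario: N = 128 samples, exponents 1.0 versus 1.1, small noise.
counts = {m: 2 ** (m - 1) for m in range(1, 8)}
sigma_w2 = 0.01
ratios = {m: (2.0 ** (-1.0 * m) + sigma_w2) / (2.0 ** (-1.1 * m) + sigma_w2) for m in counts}
s_star, bound, approx = error_estimates(ratios, counts)
print(f"s*={s_star:.4f}  Chernoff bound={bound:.4f}  CLT approximation={approx:.4f}")
```

Bisection is used here only because the convexity argument guarantees a single sign change
of the derivative on the unit interval. Any one-dimensional search over that interval would
serve equally well.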

In our first set of simulations, the bound (4.36) is used as an estimate of the
probability of error performance of an optimal detector in discriminating between
two equal-variance 1/f processes whose spectral exponents differ by $\Delta\gamma$ based on
noisy observations of length $N$ corresponding to a prescribed SNR. In the tests,
three different spectral exponent regimes are considered, corresponding to $\gamma = 0.33$,
$\gamma = 1.00$, and $\gamma = 1.67$.

In Fig. 4-11, performance is measured as a function of SNR for noisy observations
of length $N = 128$ and a parameter separation $\Delta\gamma = 0.1$. Note that there is
a threshold phenomenon: above a certain $\gamma$-dependent SNR, $\Pr(\varepsilon)$ drops dramatically.
Moreover, the threshold is lower for larger values of $\gamma$. This is to be expected
since larger values of $\gamma$ correspond to an effective 1/f power spectrum that is
increasingly peaked at $\omega = 0$, so that a correspondingly greater proportion of the
total signal power is not masked by the white observation noise. Beyond this threshold,
performance saturates as the data is essentially noise-free. However, note that
there is crossover behavior: at SNR values above the thresholds, better performance
is obtained for smaller values of $\gamma$. In subsequent tests, we restrict our attention to
performance in this high SNR regime.

In Fig. 4-12, performance is plotted as a function of the number of samples $N$ of
observed data corresponding to an SNR of 20 dB and hypotheses whose parameter
separation is $\Delta\gamma = 0.1$. In this case, there is thresholding behavior as well. For data
lengths beyond a critical order of magnitude we get strongly increasing performance
as a function of data length. Again, because we are in the high SNR regime, we
observe that the best performance is achieved for the smallest values of $\gamma$.

Finally, in Fig. 4-13, performance is plotted as a function of the separation between
the two hypotheses, specifically the difference between the spectral parameters, for
noisy observations of length $N = 128$ corresponding to an SNR of 20 dB. As we would
expect, the results illustrate that the larger the distinction between the hypotheses,
the better the performance achievable by the receiver. Again, as we are in the high
SNR regime, we find that the best performance is achieved for the smallest values of
$\gamma$.

Whenever the probability of error is low (i.e., when the SNR is high, large
data lengths are involved, or the hypotheses are well-separated), it turns out that the
CLT-based approximation (4.37) represents a more optimistic estimate of performance
than does (4.36). However, in high $\Pr(\varepsilon)$ scenarios, (4.36) constitutes a more useful
measure of system performance than does (4.37). This behavior is illustrated in
Figs. 4-14, 4-15, and 4-16 for hypotheses in the $\gamma = 1$ regime. Note that only at
sufficiently high SNR, data lengths, and parameter separations does the CLT-based
approximation actually yield a $\Pr(\varepsilon)$ estimate that is below the bound. We cannot,
of course, assess whether the CLT-based approximation is overly optimistic in the
high SNR regime. In general, we can only expect the estimate to be asymptotically
accurate as $N \to \infty$. Nevertheless, the fact that the rate of change of $\Pr(\varepsilon)$ with
respect to SNR, data length $N$, and parameter separation $\Delta\gamma$ has a similar form for
both the bound and the approximation gives us additional confidence that the earlier
plots of $\Pr(\varepsilon)$ do suggest the correct form of the functional dependence upon these
parameters.

It is important to emphasize once again that the simulations of this section constitute
a highly preliminary performance study. Indeed, any comprehensive evaluation of
the system must ultimately involve a series of Monte Carlo simulations using synthetic
and real data. Such tests are critical to exploring issues pertaining to robustness of
the algorithms to modeling errors. Nevertheless, the numerical results presented here
lend some valuable insights into the performance potential and anticipated behavior
of optimal discriminators for 1/f-type processes.

Figure 4-12: Optimal discriminator performance as a function of the number of samples
$N$ of noisy observations, as estimated by the Chernoff bound. The symbols □,
△ and ○ correspond to actual estimates; the lines are provided as visual aids only
in this case.

Figure 4-14: Optimal discriminator performance as a function of SNR, as estimated
via both the Chernoff bound (4.36) and the CLT-based approximation (4.37).

Figure 4-15: Optimal discriminator performance ($\Pr(\varepsilon)$) as a function of the number
of samples (data length $N$) of observed data, as estimated via both the Chernoff bound
(4.36) and the CLT-based approximation (4.37). The △ symbols correspond to actual
estimates; the lines are provided as visual aids only.

Figure 4-16: Optimal discriminator performance as a function of the parameter separation
$\Delta\gamma$, as estimated via both the Chernoff bound (4.36) and the CLT-based
approximation (4.37).

Before concluding this section, we consider a potentially useful and practical refinement
of the optimal discrimination problem. There are a number of application
contexts in which we would be more interested in distinguishing 1/f processes strictly
on the basis of their spectral exponents, fractal dimensions, or self-similarity parameters.
This would correspond to a hypothesis test (4.33) in which the variance parameters would
be unwanted parameters of the problem. In this case, a solution could be obtained
using a generalized likelihood ratio test [64] of the form

(4.39)

In general, expressions for the maxima involved in the construction of the likelihood
function of (4.39) cannot be obtained in closed form. However, a practical implementation
of this receiver could potentially exploit an EM algorithm of the general
type developed in Section 4.2. In terms of performance, we would anticipate that,
in general, it would only be possible to adequately evaluate such a receiver through
Monte Carlo simulations.


4.6  Outstanding  Issues 

A number of outstanding issues remain to be addressed concerning wavelet-based
representations of 1/f processes and their use in signal processing. As an example,
while we rely on both theoretical and empirical evidence that the wavelet coefficients
of 1/f processes are decorrelated, we recall that the theoretical justification of this
analysis result was rather weak. An important contribution would be to establish
both tight bounds on the correlation between wavelet coefficients and conditions on
the wavelet basis under which such bounds are valid.

A complementary issue concerns our synthesis result, which established that wavelet
expansions in terms of uncorrelated random variables give rise to nearly-1/f spectra.
An interesting and potentially important question concerns whether there exist
wavelets giving rise to an exactly-1/f spectrum. While the answer may well be negative,
perhaps sequences of dyadic or non-dyadic wavelets can be chosen to approach a
1/f spectrum arbitrarily closely.

Another issue pertains to the unusual data-windowing problem inherent in the
wavelet decomposition. In the experiments described in this work, we have avoided
the problem by modeling the data as periodic outside the finite observation interval
during computation of the DWT. However, this leads to a number of rather undesirable
effects, some of which manifested themselves in the smoothing simulations as
we noted. More effective approaches to accommodating observations on the finite
interval need to be developed.

Finally,  a  number  of  interesting,  straightforward  and  useful  extensions  to  this 
work  are  suggested  by  the  approaches  described  here.  Specifically,  the  problem  of 
distinguishing  and  isolating  two  superimposed  fractal  signals  is,  in  principle,  readily 
solved  by  the  methods  of  this  work.  In  addition,  the  separable  extension  of  the 
results  presented  herein  to  two  and  higher  dimensions  is  likewise  straightforward.  In 
each  case,  we  anticipate  that  a  number  of  powerful  yet  practical  algorithms  can  be 
developed. 

Although a general treatment of incoherent detection problems involving 1/f backgrounds
is beyond the scope of this thesis, a variety of such problems can be addressed
using some straightforward extensions to the approaches of the chapter. In general,
to model these scenarios we can consider the detection of signals known to within a
collection of parameters. Optimal detection schemes for such scenarios typically involve
generalized likelihood ratio tests in which there are both parameter estimation
and signal detection components. An example of such an extension was outlined at
the end of Section 4.5.1.

Chapter 5


Deterministically Self-Similar Signals


Signals $x(t)$ satisfying the deterministic scale-invariance property

$$x(t) = a^{-H} x(at) \qquad (5.1)$$

for all $a > 0$, are generally referred to in mathematics as homogeneous functions (of
degree $H$). Homogeneous functions can be regular or nearly so, for example $x(t) = 1$
or $x(t) = u(t)$, or they can be generalized functions, such as $x(t) = \delta(t)$. In any case,
as shown by Gel'fand [67], homogeneous functions can be parameterized with only
a few constants. As such, they constitute a rather limited class of signal models in
many contexts.

A comparatively richer class of signal models is obtained by considering waveforms
which are required to satisfy (5.1) only for values of $a$ that are integer powers of
two. This broader class of homogeneous signals then satisfies the dyadic self-similarity
property

$$x(t) = 2^{-kH} x(2^k t) \qquad (5.2)$$

for all integers $k$. It is this more general family of homogeneous signals of degree $H$
whose properties and characterizations we study in this chapter. When there is risk
of confusion in our subsequent development, we will denote signals satisfying (5.1) as
strict-sense homogeneous, and signals satisfying (5.2) as wide-sense homogeneous.
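
The distinction between the two notions can be checked numerically on a simple example.
The signal below, $t^H$ multiplied by a function that is periodic in $\log_2 t$, is a
hypothetical illustration chosen for this sketch and not an example from the text. It
satisfies the dyadic relation for every integer $k$ but fails the scale-invariance relation
for a non-dyadic factor such as $a = 3$.

```python
import math


def wide_sense_homogeneous(t, H):
    """Hypothetical example for t > 0: t^H times a function periodic in log2(t)."""
    return t ** H * (2.0 + math.cos(2.0 * math.pi * math.log2(t)))


def scaling_residual(x, t, a, H):
    """x(t) - a^(-H) * x(a*t); zero when scale invariance holds at (t, a)."""
    return x(t, H) - a ** (-H) * x(a * t, H)


H = 0.5
for k in range(-3, 4):
    residual = scaling_residual(wide_sense_homogeneous, 1.3, 2.0 ** k, H)
    print(f"a=2^{k:+d}: residual={residual:.2e}")

print("a=3:", scaling_residual(wide_sense_homogeneous, 1.0, 3.0, H))
```

The residuals for dyadic factors are zero to within rounding error, while the residual for
$a = 3$ is of order one, so this signal is wide-sense but not strict-sense homogeneous.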

Homogeneous signals constitute an interesting and potentially important class
of signals for use in a number of communications-based applications. In
Chapter 6 we shall see that when we develop a strategy for embedding
information into a waveform “on all time scales,” the resulting waveforms are
homogeneous signals. As a consequence of intrinsic self-similarity, these
modulated waveforms have the property that an arbitrarily short duration
time-segment is sufficient to recover the entire waveform, and hence the
information, given adequate bandwidth. Likewise, an arbitrarily low-bandwidth
approximation to the waveform is sufficient to recover the undistorted
waveform, and hence the information, given adequate duration. Furthermore, we
will see that these homogeneous waveforms have spectral characteristics very
much like those of 1/f processes, and, in fact, have fractal properties as
well.

Collectively,  such  properties  make  this  modulation  scheme  an  intriguing  paradigm 
for  communication  over  highly  unreliable  channels  of  uncertain  duration,  bandwidth, 
and  SNR,  as  well  as  in  a  variety  of  other  contexts.  We  shall  explore  these  and  other 
issues  in  the  next  chapter.  In  the  meantime,  we  turn  our  attention  to  developing 
a  convenient  mathematical  framework  for  characterizing  homogeneous  signals  that 
shall  prove  useful  in  the  sequel. 

While all non-trivial homogeneous signals have infinite energy, and many have
infinite power, there are nevertheless some such signals with which one can
associate a generalized 1/f-like Fourier transform, and others with which one
can associate a generalized 1/f-like power spectrum. We distinguish between
these two classes of homogeneous signals in our subsequent treatment, denoting
them energy-dominated and power-dominated homogeneous signals, respectively.
We begin our theoretical development by more rigorously defining the notion of
an energy-dominated homogeneous signal, and constructing some Hilbert space
characterizations. This will lead to some useful constructions for orthonormal
“self-similar bases” for homogeneous signals. It will become apparent that, as
in the case of statistically self-similar 1/f-type processes, orthonormal
wavelet basis expansions constitute natural and efficient representations for
these signals as well.
5.1 Energy-Dominated Homogeneous Signals

Our definition of an energy-dominated homogeneous signal is reminiscent of the
one we proposed for 1/f processes in Section 3.1. Specifically,

Definition 5.1 A wide-sense homogeneous signal $x(t)$ is said to be
energy-dominated if when $x(t)$ is filtered by an ideal bandpass filter with
frequency response

$$B_0(\omega) = \begin{cases} 1 & \pi < |\omega| \le 2\pi \\ 0 & \text{otherwise} \end{cases} \tag{5.3}$$

the resulting signal $x_0(t)$ has finite energy, i.e.,

$$\int_{-\infty}^{\infty} x_0^2(t)\,dt < \infty.$$

As with 1/f processes, the choice of passband edges at $\pi$ and $2\pi$ in our
definition is somewhat arbitrary. In particular, substituting in the
definition any passband that does not include $\omega = 0$ or $\omega = \infty$
but includes one entire frequency octave leads to precisely the same class of
signals. Nevertheless, our particular choice is both sufficient and
convenient.

We remark that the class of energy-dominated homogeneous signals includes both
reasonably regular functions, such as the constant $x(t) = 1$, the ramp
$x(t) = t$, the time-warped sinusoid $x(t) = \cos[2\pi \log_2 t]$, and the unit
step function $x(t) = u(t)$, as well as singular functions, such as
$x(t) = \delta(t)$ and its derivatives. However, although we will not always
be able to actually “plot” signals of this class, we will be able to suitably
characterize such functions in some useful ways. Let us begin by using $E^H$
to denote the collection of all energy-dominated homogeneous signals of degree
$H$. The following theorem allows us to interpret the notion of spectra for
such signals. A straightforward but detailed proof is provided in Appendix
D.1.
Theorem 5.2 When an energy-dominated homogeneous signal $x(t)$ is filtered by
an ideal bandpass filter with frequency response


Note that since $X(\omega)$ in this theorem does not depend on $\omega_L$ or
$\omega_U$, this function may be interpreted as the generalized Fourier
transform of $x(t)$. Furthermore, (5.6) implies that the generalized Fourier
transform of signals in $E^H$ obeys a 1/f-like (power-law) relationship, viz.,

$$|X(\omega)| \sim \frac{1}{|\omega|^{H+1}}.$$

However, we shall continue to reserve the term “1/f process” or “1/f signal”
for the statistically self-similar random processes defined in Chapter 3.

We also remark that $X(\omega)$ does not uniquely specify $x(t) \in E^H$,
i.e., the mapping

$$x(t) \longleftrightarrow X(\omega)$$

is not one to one. As an example, $x(t) = 1$ and $x(t) = 2$ are both in $E^H$
for $H = 0$, yet both have $X(\omega) = 0$ for $\omega > 0$. In order to
accommodate this pathological behavior in our subsequent theoretical
development we shall exploit the notion of an equivalence class. In
particular, we shall consider all signals having a common $X(\omega)$ as
equivalent or indistinguishable.

Indirectly, Theorem 5.2 suggests that an energy-dominated homogeneous signal
$x(t)$ has a convenient representation in terms of the ideal bandpass wavelet
basis. In particular, if we sample the output $x_0(t)$ of the filter in
Definition 5.1 at unit rate we obtain the sequence

$$q[n] = \int_{-\infty}^{\infty} x(t)\, b_0(t - n)\,dt = \int_{-\infty}^{\infty} x(t)\, \psi_n^0(t)\,dt = x_n^0$$

where $\psi(t)$ is the ideal bandpass wavelet whose frequency response is
given by (2.7). Furthermore, the self-similarity of $x(t)$ according to (5.2)
implies

$$\beta^{-m/2} q[n] = \int_{-\infty}^{\infty} x(t)\, \psi_n^m(t)\,dt = x_n^m$$

where, as in earlier chapters, $\beta$ is defined by

$$\beta = 2^{2H+1} = 2^{\gamma}. \tag{5.7}$$

Consequently, using the orthonormal wavelet synthesis formula (2.5a), $x(t)$
can be expressed as^1

$$x(t) = \sum_m \sum_n \beta^{-m/2} q[n]\, \psi_n^m(t) \tag{5.8}$$

from which we see that $x(t)$ is completely specified in terms of $q[n]$. We
term $q[n]$ a generating sequence for $x(t)$ since, as we shall see, this
representation leads to techniques for synthesizing useful approximations to
homogeneous signals in practice.
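As a rough illustration of how a generating sequence determines a dyadically
self-similar waveform, the following Python sketch evaluates a truncated
version of the double sum (5.8). Several choices here are assumptions made
only so that the example is runnable: the Haar wavelet is substituted for the
unrealizable ideal bandpass wavelet, the generating sequence is supported on a
few indices $n \ge 0$, the sum over scales is truncated to a finite range, and
the names `haar` and `synthesize` are hypothetical.

```python
import numpy as np

def haar(t):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    first_half = (t >= 0.0) & (t < 0.5)
    second_half = (t >= 0.5) & (t < 1.0)
    return first_half.astype(float) - second_half.astype(float)

def synthesize(t, q, H, m_lo, m_hi):
    """Truncated form of (5.8): scales m_lo..m_hi, translates n = 0..len(q)-1."""
    t = np.asarray(t, dtype=float)
    beta = 2.0 ** (2.0 * H + 1.0)
    x = np.zeros_like(t)
    for m in range(m_lo, m_hi + 1):
        # beta**(-m/2) weights the scale; 2**(m/2) normalizes psi_n^m(t).
        gain = beta ** (-m / 2.0) * 2.0 ** (m / 2.0)
        for n, q_n in enumerate(q):
            x += gain * q_n * haar(2.0 ** m * t - n)
    return x

H = 0.25                                  # assumed degree
rng = np.random.default_rng(seed=0)
q = rng.standard_normal(6)                # assumed finite generating sequence
t = np.linspace(0.0, 8.0, 801)
M = 6

# Dilating by two and rescaling by 2**(-H) shifts the range of scales by one.
lhs = 2.0 ** (-H) * synthesize(2.0 * t, q, H, -M, M)
rhs = synthesize(t, q, H, -M + 1, M + 1)
```

With the scale range truncated, `lhs` and `rhs` agree exactly up to rounding;
letting the range grow in both directions recovers the self-similarity
relation (5.2) for $k = 1$.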

More generally, as we shall see, (5.8) is a useful expansion for a broad
family of homogeneous functions. However, a homogeneous function $x(t)$
defined by (5.8) is, specifically, energy-dominated if and only if $q[n]$ has
finite energy, i.e.,

$$\sum_n q^2[n] < \infty.$$

^1 Henceforth, we shall assume in this chapter, and in the associated
appendix, that all summations over $m$ and $n$ extend from $-\infty$ to
$\infty$ unless otherwise noted.

This follows from the fact that since $x_0(t)$ in Definition 5.1 has the
orthonormal expansion

$$x_0(t) = \sum_n q[n]\, \psi_n^0(t) \tag{5.9}$$

we have

$$\int_{-\infty}^{\infty} x_0^2(t)\,dt = \sum_n q^2[n]. \tag{5.10}$$

In turn, this observation implies that we may use (5.10) to define the
following norm on $E^H$:

$$\|x\|_\psi^2 = \int_{-\infty}^{\infty} x_0^2(t)\,dt = \sum_n q^2[n]. \tag{5.11}$$

This is a valid norm provided we adopt the convention that two homogeneous
functions $f(t)$ and $g(t)$ in $E^H$ are equivalent if their generalized
Fourier transforms $F(\omega)$ and $G(\omega)$, respectively, are identical on
$0 < \omega < \infty$. In turn, this implies that $f(t)$ and $g(t)$ are
equivalent if they differ by a homogeneous function whose frequency content is
concentrated at the origin $\omega = 0$. In the case that $H$ is an integer
$k \ge 0$, this would correspond to

$$f(t) - g(t) = C t^k, \tag{5.12}$$

for an arbitrary constant $C$. Whether (5.12) characterizes all equivalent
functions for $H = k \ge 0$, and whether there are equivalent functions for
other values of $H$, are open questions. In any case, with this notion of an
equivalence class, $E^H$ is, in fact, complete with respect to the norm (5.11)
and, hence, constitutes a Banach space. This is an immediate consequence of
the isomorphism between $E^H$ and $\ell^2(\mathbb{Z})$.

More generally, we may define an inner product between two energy-dominated
homogeneous signals $f(t)$ and $g(t)$, whose generating sequences under the
bandpass basis are $a[n]$ and $b[n]$, respectively, as

$$\langle f, g \rangle_\psi = \int_{-\infty}^{\infty} f_0(t)\, g_0(t)\,dt = \sum_n a[n]\, b[n] \tag{5.13}$$

where $f_0(t)$ and $g_0(t)$ are the outputs of the bandpass filter (5.3), and
where the last equality is a consequence, again, of the orthonormality of the
expansion (5.9). With this inner product, the induced norm is, of course,
(5.11). Since $E^H$ is complete, it therefore constitutes a Hilbert space.
Furthermore, one can readily construct “self-similar” bases within $E^H$.
Indeed, (5.8) immediately provides an orthonormal basis for $E^H$. In
particular, for any $x(t) \in E^H$, we have the synthesis and analysis pair

$$x(t) = \sum_n q[n]\, \theta_n^H(t) \tag{5.14a}$$

$$q[n] = \langle x, \theta_n^H \rangle_\psi \tag{5.14b}$$

where one can easily verify that the basis functions

$$\theta_n^H(t) = \sum_m \beta^{-m/2}\, \psi_n^m(t) \tag{5.15}$$

are self-similar, orthogonal, and have unit norm.

The fact that the ideal bandpass basis is unrealizable means that (5.14a) is
not a practical mechanism for synthesizing or analyzing homogeneous signals.
However, as we will show, more practical wavelet bases are equally suitable
for defining an inner product for the Hilbert space $E^H$. Specifically, we
next show that a broad class of wavelet bases can be used to construct such
inner products, and that, as a consequence, some highly efficient algorithms
arise for processing homogeneous signals.
In order to determine which orthonormal wavelet bases can be used to define
inner products for $E^H$, we must determine for which wavelets $\psi(t)$ the
homogeneous signal

$$x(t) = \sum_m \sum_n \beta^{-m/2} q[n]\, \psi_n^m(t) \tag{5.16}$$

is energy-dominated when $q[n]$ has finite energy and, simultaneously, the
sequence

$$q[n] = \int_{-\infty}^{\infty} x(t)\, \psi_n^0(t)\,dt$$

has finite energy when $x(t)$ is an energy-dominated homogeneous signal. In
other words, we seek conditions on a wavelet basis such that the space of
generating sequences $q[n] \in \ell^2(\mathbb{Z})$ will be isomorphic to
$E^H$. Our main result is a corollary of the following theorem. A proof of
this theorem is provided in Appendix D.2.

Theorem 5.3 Consider an orthonormal wavelet basis such that $\psi(t)$ has $R$
vanishing moments for some integer $R \ge 1$, i.e.,

$$\int_{-\infty}^{\infty} t^r\, \psi(t)\,dt = 0, \qquad r = 0, 1, \ldots, R-1. \tag{5.17}$$

Then when any homogeneous signal $x(t)$ whose degree $H$ is such that
$\gamma = 2H + 1$ satisfies $0 < \gamma < 2R$ is filtered by an LTI system
with impulse response $\psi(-t)$, the resulting signal $x_0(t)$ has finite
energy if and only if $x(t)$ is energy-dominated.

When the output $x_0(t)$ of the filter in Theorem 5.3 is sampled at unit rate,
we obtain the sequence of wavelet coefficients

$$q[n] = \int_{-\infty}^{\infty} x(t)\, \psi_n^0(t)\,dt = x_n^0. \tag{5.18}$$

Furthermore, from the self-similarity of $x(t)$ we have also

$$\beta^{-m/2} q[n] = \int_{-\infty}^{\infty} x(t)\, \psi_n^m(t)\,dt = x_n^m. \tag{5.19}$$
Clearly then, associated with each $\psi(t)$ satisfying (5.17) is a generating
sequence $q[n]$ for the homogeneous signal $x(t)$. Also, again because of the
orthogonality of $\psi(t)$ with respect to its unit translates, we have

$$\int_{-\infty}^{\infty} x_0^2(t)\,dt = \sum_n q^2[n].$$

Combining this result with that of Theorem 5.3 allows us to immediately deduce
our main result:

Corollary 5.4 For any wavelet basis satisfying the conditions of Theorem 5.3,
the homogeneous signal

$$x(t) = \sum_m \sum_n \beta^{-m/2} q[n]\, \psi_n^m(t) \tag{5.20}$$

is energy-dominated (i.e., $x(t) \in E^H$) if and only if $q[n]$ has finite
energy (i.e., $q[n] \in \ell^2(\mathbb{Z})$).

This corollary implies that we may choose for our Hilbert space $E^H$ from
among a large number of inner products whose induced norms are all equivalent.
In particular, for any wavelet $\psi(t)$ with sufficient vanishing moments, we
may define the inner product between two functions $f(t)$ and $g(t)$ in $E^H$
whose generating sequences are $a[n]$ and $b[n]$, respectively, as

$$\langle f, g \rangle_\psi = \sum_n a[n]\, b[n]. \tag{5.21}$$

It is important to emphasize that this collection of inner products is not
necessarily exhaustive. Even for wavelet-based inner products, Corollary 5.4
asserts only that the vanishing moment condition is sufficient to ensure that
the inner product generates an equivalent norm. It is conceivable, however,
that the vanishing moment condition is not a necessary condition.
In any case, each wavelet-based inner product leads immediately to an
orthonormal self-similar basis for $E^H$: if $x(t) \in E^H$ then

$$x(t) = \sum_n q[n]\, \theta_n^H(t) \tag{5.22a}$$

$$q[n] = \langle x, \theta_n^H \rangle_\psi \tag{5.22b}$$

where, again, the basis functions

$$\theta_n^H(t) = \sum_m \beta^{-m/2}\, \psi_n^m(t) \tag{5.23}$$

are all self-similar, mutually orthogonal, and have unit norm. For the case
$H = 0$, Fig. 5-1 depicts three of the self-similar basis functions
$\theta_n^H(t)$ corresponding to the Daubechies 5th-order compactly-supported
wavelet basis. These functions were generated by evaluating the summation
(5.23) over a large but finite range of scales $m$. We emphasize that $q[n]$
is only a unique characterization of $x(t)$ when we associate it with a
particular choice of wavelet $\psi(t)$. In general, every different wavelet
decomposition of $x(t)$ will yield a different $q[n]$, though all will have
finite energy.
For an arbitrary non-homogeneous signal $x(t)$, the sequence

$$q[n] = \langle x, \theta_n^H \rangle_\psi$$

defines the projection of $x(t)$ onto $E^H$, so that

$$\hat{x}(t) = \sum_n q[n]\, \theta_n^H(t)$$

represents the closest homogeneous signal to $x(t)$,

$$\hat{x}(t) = \arg\min_{y(t) \in E^H} \|y - x\|_\psi$$

with respect to the induced norm $\|\cdot\|_\psi$. In Chapter 6, it will be
apparent how such projections arise rather naturally in treating problems of
estimation with homogeneous signals.

Figure 5-1: Three self-similar basis functions $\theta_n^H(t)$ of an
orthonormal basis for $E^H$, $H = 0$.

Finally, we remark that wavelet-based characterizations give rise to a
convenient expression for the generalized Fourier transform of an
energy-dominated homogeneous signal $x(t)$. In particular, if we take the
Fourier transform of (5.16) we get, via some routine algebra,

$$X(\omega) = \sum_m 2^{-(H+1)m}\, \Psi(2^{-m}\omega)\, Q(2^{-m}\omega) \tag{5.24}$$

where $Q(\omega)$ is the Fourier transform of $q[n]$. This spectrum is to be
interpreted in the sense of Theorem 5.2, i.e., $X(\omega)$ defines the
spectral content of the output of a bandpass filter at every frequency
$\omega$ within the passband.
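The routine algebra can be sketched as follows. The sketch assumes the usual
normalization $\psi_n^m(t) = 2^{m/2}\psi(2^m t - n)$ and the discrete-time
Fourier transform convention $Q(\omega) = \sum_n q[n] e^{-j\omega n}$, neither
of which is restated in this section, and it uses
$\beta^{-m/2} = 2^{-(H+1/2)m}$ from (5.7).

```latex
\begin{align*}
\int_{-\infty}^{\infty} \psi_n^m(t)\, e^{-j\omega t}\,dt
  &= 2^{-m/2}\, \Psi(2^{-m}\omega)\, e^{-j 2^{-m}\omega n}, \\
X(\omega)
  &= \sum_m \beta^{-m/2}\, 2^{-m/2}\, \Psi(2^{-m}\omega)
     \sum_n q[n]\, e^{-j 2^{-m}\omega n} \\
  &= \sum_m 2^{-(H+1/2)m}\, 2^{-m/2}\, \Psi(2^{-m}\omega)\, Q(2^{-m}\omega) \\
  &= \sum_m 2^{-(H+1)m}\, \Psi(2^{-m}\omega)\, Q(2^{-m}\omega).
\end{align*}
```

Replacing $\omega$ by $2^k\omega$ and re-indexing the sum over $m$ shows that
$|\omega|^{H+1} X(\omega)$ is unchanged, which is the power-law behavior noted
after Theorem 5.2.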

In summary, we have shown that a broad class of wavelet-based norms are
equivalent for $E^H$ in a mathematical sense, and that each of these norms is
associated with a particular inner product. This leads one to speculate
whether every equivalent norm for $E^H$ can be associated with a wavelet
basis, in which case the basis functions associated with every orthonormal
basis for $E^H$ could be expressed in terms of some wavelet according to
(5.23). This issue remains unresolved. In any case, regardless of whether the
collection of inner products we construct is exhaustive or not, they at least
constitute a highly convenient and practical collection from which to choose
in any given application involving the use of homogeneous signals.


5.2  Power-Dominated  Homogeneous  Signals 

Energy-dominated homogeneous signals have infinite energy. In fact, most have
infinite power as well. However, there are other infinite power homogeneous
signals that are not energy-dominated. In this section, we consider a more
general class of infinite-power homogeneous signals that will be of interest
in Chapter 6. We begin with what is a natural definition.

Definition 5.5 A wide-sense homogeneous signal $x(t)$ is said to be
power-dominated if when $x(t)$ is filtered by an ideal bandpass filter with
frequency response (5.3) the resulting signal $x_0(t)$ has finite power, i.e.,

$$\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} x_0^2(t)\,dt < \infty.$$

We use the notation $P^H$ to designate the class of power-dominated
homogeneous signals of degree $H$. Note that while our definition necessarily
includes the energy-dominated signals, which have zero power, insofar as our
discussion is concerned they constitute a degenerate case.

Before proceeding, we recall some basic definitions and useful relationships
for deterministic finite-power signals. Reference [68] contains a more
thorough exposition of these results. A finite-power signal $f(t)$ has a
(deterministic) autocorrelation

$$R_f(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(t)\, f(t - \tau)\,dt$$

whose Fourier transform is the (deterministic) power spectrum

$$S_f(\omega) = \int_{-\infty}^{\infty} R_f(\tau)\, e^{-j\omega\tau}\,d\tau
  = \lim_{T \to \infty} \frac{1}{2T} \left| \int_{-T}^{T} f(t)\, e^{-j\omega t}\,dt \right|^2.$$

Analogous formulae are obtained for finite power sequences. The deterministic
autocorrelation and power spectrum satisfy many of the LTI filtering and
sampling properties satisfied by the corresponding quantities for stationary
random processes.

Analogous  to  the  energy-dominated  case,  we  can  establish  the  following  theorem 
describing  the  spectral  properties  of  power-dominated  homogeneous  signals. 

Theorem 5.6 When a power-dominated homogeneous signal $x(t)$ is filtered by an
ideal bandpass filter with frequency response (5.4), the resulting signal
$y(t)$ has finite power and a power spectrum of the form

$$S_y(\omega) = \begin{cases} S_x(\omega) & \omega_L < |\omega| \le \omega_U \\ 0 & \text{otherwise} \end{cases} \tag{5.25}$$

where $S_x(\omega)$ is some function that is independent of $\omega_L$ and
$\omega_U$ and has octave-spaced ripple, i.e., for all integers $k$,

$$|\omega|^{\gamma}\, S_x(\omega) = |2^k\omega|^{\gamma}\, S_x(2^k\omega). \tag{5.26}$$

The details of the proof of this theorem are contained in Appendix D.3,
although it is identical in style to the proof of its counterpart, Theorem
5.2. Note that since $S_x(\omega)$ in this theorem does not depend on
$\omega_L$ or $\omega_U$, this function may be interpreted as the generalized
power spectrum of $x(t)$. Furthermore, the relation (5.26) implies that
signals in $P^H$ have a generalized time-averaged power spectrum that is
1/f-like, i.e.,

$$S_x(\omega) \sim \frac{1}{|\omega|^{\gamma}}$$

where $\gamma$ is as defined in (5.7). However, we again emphasize that we
reserve the term “1/f process” or “1/f signal” for the statistically
self-similar random processes defined in Chapter 3. Note too that there is no
notion of time-averaging in the spectrum defining a 1/f process.

In turn, Theorem 5.6 directly implies that a homogeneous signal $x(t)$ is
power-dominated if and only if its generating sequence $q[n]$ in the ideal
bandpass wavelet basis has finite power, i.e.,

$$\lim_{L \to \infty} \frac{1}{2L+1} \sum_{n=-L}^{L} q^2[n] < \infty.$$

Similarly we can readily deduce from the results of Section 5.1 that, in fact,
for any orthonormal wavelet basis with sufficient vanishing moments $R$ that
$0 < \gamma < 2R$ (where $\gamma = 2H + 1$), the generating sequence for a
homogeneous signal of degree $H$ in that basis has finite power if and only if
the signal is power-dominated. This implies that, for such bases, when we use
(5.22a) to synthesize a homogeneous signal $x(t)$ using an arbitrary finite
power sequence $q[n]$, we are assured that $x(t) \in P^H$. Likewise, when we
use (5.22b) to analyze any signal $x(t) \in P^H$, we are assured that $q[n]$
has finite power. Again we emphasize that different choices of wavelet basis
in such an analysis will yield different generating sequences, all of which,
however, have finite power.

Energy-dominated homogeneous signals of arbitrary degree $H$ can be highly
regular, at least away from $t = 0$. In contrast, power-dominated homogeneous
signals typically have a fractal structure similar to 1/f processes of
corresponding degree $H$. In fact, it is a reasonable conjecture that the
Hausdorff-Besicovitch dimension, when defined, is identical for the two types
of signals. Indeed, despite their obvious structural differences,
power-dominated homogeneous signals and 1/f processes “look” remarkably
similar in a qualitative sense. This is apparent in Fig. 5-2, where we depict
the sample path of a 1/f process alongside a randomly-generated
power-dominated homogeneous signal of the same degree. We reiterate that in
Fig. 5-2(a), the self-similarity of the 1/f process is statistical, while in
Fig. 5-2(b), the self-similarity of the homogeneous signal is deterministic.

We can quantify the similarity between the two types of signals through an
observation about their spectra. In general, we remarked that for a given $H$,
both exhibit power law spectral relationships with the same parameter
$\gamma$. The following theorem further substantiates this for the case of
randomly-generated power-dominated homogeneous signals. The details of the
proof are contained in Appendix D.4.

Theorem 5.7 For any orthonormal wavelet basis in which $\psi(t)$ has $R$th
order regularity for some $R \ge 1$, the random process $x(t)$ synthesized
according to

$$x(t) = \sum_m \sum_n \beta^{-m/2} q[n]\, \psi_n^m(t) \tag{5.27}$$

using a correlation-ergodic (e.g., Gaussian), zero-mean, stationary white
random sequence $q[n]$ of variance $\sigma^2$, has a generalized time-averaged
power spectrum of the form

$$S_x(\omega) = \sigma^2 \sum_m 2^{-\gamma m}\, |\Psi(2^{-m}\omega)|^2. \tag{5.28}$$
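As a consistency check, a spectrum of this form exhibits the octave-spaced
ripple required of the generalized power spectrum in Theorem 5.6. The short
derivation below reads (5.28) as
$S_x(\omega) = \sigma^2 \sum_m 2^{-\gamma m} |\Psi(2^{-m}\omega)|^2$ and
re-indexes the sum with $m' = m - k$; it assumes only that the sum converges.

```latex
\begin{align*}
S_x(2^k\omega)
  &= \sigma^2 \sum_m 2^{-\gamma m}\, |\Psi(2^{-(m-k)}\omega)|^2 \\
  &= \sigma^2 \sum_{m'} 2^{-\gamma (m'+k)}\, |\Psi(2^{-m'}\omega)|^2
   = 2^{-\gamma k}\, S_x(\omega), \\
|2^k\omega|^{\gamma}\, S_x(2^k\omega)
  &= 2^{\gamma k}\, |\omega|^{\gamma}\, 2^{-\gamma k}\, S_x(\omega)
   = |\omega|^{\gamma}\, S_x(\omega).
\end{align*}
```

Hence $|\omega|^{\gamma} S_x(\omega)$ is invariant under $\omega \to 2^k\omega$
for every integer $k$, which is the 1/f-like behavior with octave-spaced
ripple.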

Note that the time-averaged spectrum (5.28) is identical to the time-averaged
spectrum (3.36) for the wavelet-based synthesis of 1/f processes described in
Section 3.2.2. However, we must be careful not to misinterpret this result. It
does not suggest that (5.27) is a reasonable approach for synthesizing 1/f
processes. Indeed, it would constitute a very poor model for 1/f-type behavior
based on the analysis results of Section 3.2.2: when 1/f processes are
decomposed into wavelet bases we get statistical rather than deterministic
similarity from scale to scale. Instead, the theorem remarks that the
time-averaged second order statistics of the two types of signals are the
same. Consequently, one would anticipate that distinguishing 1/f processes
from power-dominated homogeneous signals based on spectral analysis alone
would be rather difficult. Nevertheless, the tremendous structural differences
between the two mean that they may be readily distinguished using other
techniques such as, for example, wavelet-based analysis.

Figure 5-2: Comparison between the sample path of a 1/f process and a
power-dominated homogeneous signal. Both correspond to $\gamma = 1$ (i.e.,
$H = 0$). (a) A sample function of a 1/f process. (b) A randomly-generated
power-dominated homogeneous signal.

Equation (5.28) corresponds to the superposition of the spectra associated
with each scale or octave-band in the wavelet-based synthesis. In general, we
would expect the spectrum of $x(t)$ to be the superposition of the spectra of
the individual channels together with their cross-spectra. However, the
time-averaged cross-spectra in this scenario are zero, which is a consequence
of the fact that the white sequence $q[n]$ is modulated at different rates in
each channel. Indeed, the time-averaged correlation is zero between $q[n]$ and
$q[2^m n]$ for any $m \ge 1$; that is, white noise is uncorrelated with
dilated and compressed versions of itself.
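The last observation is easy to examine empirically. The Python sketch below
is an illustration under stated assumptions rather than part of the proof: it
draws a unit-variance Gaussian white sequence with a fixed seed, restricts the
time average to indices $n \ge 0$, and uses the hypothetical helper name
`time_averaged_correlation`.

```python
import numpy as np

def time_averaged_correlation(q, m, num_terms):
    """Average of q[n] * q[2**m * n] over n = 0, ..., num_terms - 1."""
    n = np.arange(num_terms)
    return float(np.mean(q[n] * q[(2 ** m) * n]))

rng = np.random.default_rng(seed=0)    # fixed seed so the run is repeatable
num_terms = 100_000
max_m = 3
# Unit-variance white sequence, long enough to index q[2**max_m * n].
q = rng.standard_normal((2 ** max_m) * num_terms)

# m = 0 compares the sequence with itself and estimates the variance.
same_rate = time_averaged_correlation(q, 0, num_terms)

# m >= 1 compares the sequence with its decimated versions.
cross_rate = [time_averaged_correlation(q, m, num_terms) for m in range(1, max_m + 1)]
```

For a sequence of this length, `same_rate` lies close to the variance while
every entry of `cross_rate` is small, consistent with the cross-spectra
between channels averaging to zero.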

Because (5.28) and (3.36) are identical, we can use Theorem 3.4 to conclude
that the spectra of a class of randomly generated power-dominated homogeneous
signals are bounded on any finite interval of the frequency axis that does not
include $\omega = 0$. However, there are many $x(t) \in P^H$ whose spectra are
not bounded in this manner. An interesting and potentially important subclass
of power-dominated homogeneous signals with such unbounded spectra are those
for which $x_0(t)$ as defined in Definition 5.5 is periodic. This class of
power-dominated homogeneous signals will be referred to as
periodicity-dominated. It is straightforward to establish that these
homogeneous signals have the property that when passed through an arbitrary
bandpass filter of the form (5.4) the output is periodic as well. These
signals have a power spectrum consisting of impulses (spikes) whose areas
decay according to a $1/|\omega|^{\gamma}$ relationship. An important class of
periodicity-dominated homogeneous signals can be generated through a
wavelet-based synthesis of the form (5.22) in which the generating sequence
$q[n]$ is periodic.

5.3 Discrete-Time Algorithms for Processing Homogeneous Signals

Orthonormal wavelet representations provide some useful insights into
homogeneous signals $x(t)$. For instance, because $q[n]$ is replicated at each
scale in the representation (5.20), the detail signals, which represent $q[n]$
modulated into a particular octave-band, are simply amplitude-scaled and
time-dilated or compressed versions of one another. The corresponding
time-frequency portrait of a homogeneous signal (of degree $H = -1/2$, for
convenience) is depicted in Fig. 5-3, from which the scaling properties are
apparent. We emphasize again that the partitioning in such time-frequency
portraits is idealized. In general, there is both spectral and temporal
overlap between cells.

Wavelet representations also lead to some highly efficient algorithms for
synthesizing, analyzing, and processing homogeneous signals, just as they did
in the case of 1/f processes. The signal processing structures we develop in
this section are a consequence of applying the DWT algorithm to the highly
structured form of the wavelet coefficients of homogeneous signals.

We have already encountered one discrete-time representation for a homogeneous
signal $x(t)$, namely that in terms of a generating sequence $q[n]$

$$x(t) \longleftrightarrow q[n] \tag{5.29}$$

which corresponds to the coefficients of the expansion of $x(t)$ in an
orthonormal basis $\{\theta_n^H(t)\}$ for $E^H$. When the $\theta_n^H(t)$ are
derived from a wavelet basis, a second discrete-time representation for $x(t)$
is available, which we now discuss.

Consider the coefficients $a_n^m$ characterizing the resolution-$2^m$
approximation of a homogeneous signal $x(t)$ with respect to a particular
multiresolution signal analysis. If $\phi(t)$ is the scaling function
associated with this analysis, then these coefficients are defined through the
projections (2.13), viz.,

$$a_n^m = \int_{-\infty}^{\infty} x(t)\, \phi_n^m(t)\,dt.$$

Using the self-similarity of $x(t)$ it is straightforward to show that these
coefficients are identical at all scales to within a scale factor, i.e.,

$$a_n^m = \beta^{-m/2}\, a_n^0. \tag{5.30}$$

Consequently, the characteristic sequence associated with $x(t)$, defined via

$$p[n] = a_n^0 \tag{5.31}$$

is an alternative discrete-time characterization of $x(t)$, since knowledge of
$p[n]$ is sufficient to reconstruct $x(t)$ to arbitrary accuracy. We stress
that the characteristic sequence associated with $x(t)$ is not unique:
distinct multiresolution signal analyses generally yield different
characteristic sequences for any given homogeneous signal. Furthermore, we
shall require the wavelet associated with any multiresolution analysis we
consider to have sufficient vanishing moments that it meets the conditions of
Theorem 5.3.

The characteristic sequence $p[n]$ is associated with a resolution-limited
approximation to a homogeneous signal $x(t)$. Specifically, $p[n]$ represents
unit-rate samples of the output of a roughly lowpass filter with frequency
response $\Phi^*(\omega)$ driven by $x(t)$. Because frequencies in the
neighborhood of the spectral origin, where there is a preponderance of energy,
are passed by such a filter, $p[n]$ will often have infinite energy or, worse,
infinite power, even when $q[n]$ has finite energy ($x(t) \in E^H$). Still
more severe, there will be cases in which $p[n]$ is actually singular and,
hence, cannot be “plotted.” In fact, the characteristic sequence $p[n]$ is
typically a generalized sequence in the same sense that its corresponding
homogeneous signal $x(t)$ is a generalized function.
Figure 5-4: The discrete-time self-similarity identity for a characteristic
sequence $p[n]$.

The characteristic sequence can be viewed as a discrete-time homogeneous
signal. As such it satisfies the discrete-time self-similarity relation^2

$$\beta^{1/2} p[n] = \sum_k h[k - 2n]\, p[k] \tag{5.32}$$

which is readily obtained using (5.30) with (5.31) in the DWT analysis
equation (2.21a). Indeed, (5.32) is a statement that when $p[n]$ is lowpass
filtered with the conjugate filter whose unit-sample response is $h[-n]$ and
then downsampled, we get back an amplitude-scaled version of $p[n]$, as
depicted in Fig. 5-4. And although characteristic sequences are generalized
sequences, when highpass filtered with the corresponding conjugate highpass
filter whose unit-sample response is $g[-n]$, the output is a finite energy or
finite power sequence, depending on whether $p[n]$ corresponds to a
homogeneous signal $x(t)$ that is energy-dominated or power-dominated,
respectively. Consequently, we can analogously classify $p[n]$ as
energy-dominated in the former case, and power-dominated in the latter case.
In fact, when the output of the highpass filter is downsampled at rate two, we
recover the generating sequence $q[n]$ associated with the decomposition of
$x(t)$ into the corresponding wavelet basis, i.e.,

$$\beta^{1/2} q[n] = \sum_k g[k - 2n]\, p[k]. \tag{5.33}$$

This can be readily verified by substituting (5.18) with (5.19), and (5.31)
with (5.30), into the DWT analysis equation (2.21b).

^Relations of this type may be considered discrete-time counterparts of the dilation equations
considered by Strang in [16].

From a different perspective, (5.33) provides a convenient mechanism for obtaining
the representation for a homogeneous signal x(t) in terms of its generating sequence
q[n] from one in terms of its corresponding characteristic sequence p[n], i.e.,

p[n] → q[n].

To obtain the reverse mapping

q[n] → p[n]

is less straightforward. For an arbitrary sequence q[n], the associated characteristic
sequence p[n] is the solution to the linear equation

$$\beta^{-1/2}\, p[n] - \sum_k h[n - 2k]\, p[k] = \sum_k g[n - 2k]\, q[k], \qquad (5.34)$$

as can be verified by specializing the DWT synthesis equation (2.21c) to the case of
homogeneous signals. There appears to be no direct method for solving this equation;
even in the frequency domain, where, for finite energy q[n], (5.34) becomes

$$\beta^{-1/2}\, P(\omega) = H(\omega)\, P(2\omega) + G(\omega)\, Q(2\omega)$$

with P(ω) the generalized Fourier transform of p[n], the problem appears intractable.
However, the DWT synthesis pyramid of Fig. 2-6(b) suggests a convenient and
efficient iterative algorithm for constructing p[n] from q[n]. In particular, denoting the
estimate of p[n] on the ith iteration by $p^{[i]}[n]$, the algorithm is

$$p^{[0]}[n] = 0 \qquad (5.35a)$$

$$p^{[i+1]}[n] = \beta^{1/2} \sum_k \left\{ h[n - 2k]\, p^{[i]}[k] + g[n - 2k]\, q[k] \right\}. \qquad (5.35b)$$

This recursive upsample-filter-merge algorithm, depicted in Fig. 5-5, can be
interpreted as repeatedly modulating q[n] with the appropriate gain into successively
lower octave bands of the frequency interval 0 < |ω| ≤ π, and we note that the
precomputable quantity

$$q_*[n] = \sum_k g[n - 2k]\, q[k]$$

represents the q[n] modulated into essentially the upper half band of frequencies.

Figure 5-5: Iterative algorithm for the synthesis of the characteristic sequence p[n]
of a homogeneous signal x(t) from its generating sequence q[n]. The notation $p^{[i]}[n]$
denotes the value of p[n] at the ith iteration.
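
The recursion (5.35) is short enough to try numerically. The following is a minimal sketch
rather than a definitive implementation. It assumes the Haar QMF pair (h = [1, 1]/√2,
g = [1, −1]/√2), a finite-length q[n] extended with zeros, and hypothetical helper names
(`upsample_filter`, `filter_downsample`, `add_padded`, `synthesize_characteristic`) that do
not appear in the original development. It iterates
p^{[i+1]}[n] = β^{1/2} Σ_k { h[n − 2k] p^{[i]}[k] + g[n − 2k] q[k] } and then checks that
highpass analysis of an iterate returns β^{1/2} q[n], while lowpass analysis returns
β^{1/2} times the previous iterate.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)
H_HAAR = np.array([1.0, 1.0]) / SQRT2   # lowpass h[n] (assumption: Haar)
G_HAAR = np.array([1.0, -1.0]) / SQRT2  # highpass g[n] (assumption: Haar)


def upsample_filter(x, f):
    """Return y[n] = sum_k f[n - 2k] x[k] (upsample by two, then filter)."""
    up = np.zeros(2 * len(x))
    up[::2] = x
    return np.convolve(up, f)


def filter_downsample(p, f, count):
    """Return y[j] = sum_n f[n - 2j] p[n] for j = 0, ..., count - 1."""
    padded = np.concatenate([p, np.zeros(2 * count + len(f))])
    return np.array([f @ padded[2 * j:2 * j + len(f)] for j in range(count)])


def add_padded(a, b):
    """Add two sequences that both start at n = 0 but differ in length."""
    out = np.zeros(max(len(a), len(b)))
    out[:len(a)] += a
    out[:len(b)] += b
    return out


def synthesize_characteristic(q, beta, iterations, h=H_HAAR, g=G_HAAR):
    """Run `iterations` steps of the upsample-filter-merge recursion."""
    q_star = upsample_filter(q, g)   # precomputable highpass branch
    p = np.zeros(1)                  # p^[0][n] = 0
    for _ in range(iterations):
        p = np.sqrt(beta) * add_padded(upsample_filter(p, h), q_star)
    return p


if __name__ == "__main__":
    H = 0.3                          # illustrative degree, not from the text
    beta = 2.0 ** (2 * H + 1)
    q = np.array([1.0, -2.0, 0.5, 3.0])
    p2 = synthesize_characteristic(q, beta, 2)
    p3 = synthesize_characteristic(q, beta, 3)
    # Highpass analysis of an iterate gives back sqrt(beta) * q.
    print(np.allclose(filter_downsample(p3, G_HAAR, len(q)), np.sqrt(beta) * q))
    # Lowpass analysis gives back sqrt(beta) times the previous iterate.
    print(np.allclose(filter_downsample(p3, H_HAAR, len(p2)), np.sqrt(beta) * p2))
```

Each pass doubles the length of the stored sequence, which is one way to see why only a
finite number of iterations is used in practice.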

The modulation-based interpretation of the algorithm suggests that issues of
convergence as i → ∞ are ultimately tied to issues of whether or not p[n] is a regular
sequence. In particular, when p[n] is well-behaved, the algorithm exhibits rapid
convergence. The Fourier transform of the error between iterations is readily derived
as

$$P^{[i+1]}(\omega) - P^{[i]}(\omega) = \beta^{(i+1)/2} \left[ \prod_{l=0}^{i-1} H(2^l \omega) \right] G(2^i \omega)\, Q(2^{i+1} \omega)$$

$$\qquad = (2\beta)^{i/2}\, \beta^{1/2}\, \Phi(2^i \omega)\, G(2^i \omega)\, Q(2^{i+1} \omega) / \Phi(\omega)$$

for |ω| ≤ π, where the last equality follows from the repeated application of (A.2a).
In turn, using Parseval's relation, the total energy difference between iterates can be
obtained by integrating the squared magnitude of this expression over |ω| ≤ π.
However, it is important to keep in mind that any practical application of
homogeneous signals will ultimately only involve scaling behavior over a finite range of
scales, corresponding to q[n] modulated into a finite range of adjacent octave bands.
Consequently, only a finite number of iterations would be used with the algorithm
(5.35). More generally, this also means that many of the theoretical issues associated
with homogeneous signals concerning singularities and convergence do not present
practical difficulties in the application of deterministically self-similar signals, as will
be apparent in our developments of the next chapter.

In our closing remarks, we mention that there would appear to be important
connections to be explored between the self-similar signal theory described here and
the work of Barnsley et al. [69] on deterministically self-affine one-dimensional and
multi-dimensional signals. Interestingly, the recent work of Malassenet and Mersereau
[70] has shown that these iterated function systems have efficient representations in
terms of wavelet bases as well.

Chapter 6


Fractal Modulation


There  are  a  number  of  interesting  potential  applications  for  the  wide-sense  homo¬ 
geneous  signal  theory  developed  in  the  last  chapter.  In  this  chapter,  we  consider  a 
specific  one  as  an  indication  of  the  direction  that  some  applications  may  take.  In 
particular, we explore the use of homogeneous signals as modulating waveforms in
a  communication-based  context.  We  begin  by  considering  an  idealized  but  general 
channel  model,  and  proceed  to  demonstrate  that  the  use  of  homogeneous  waveforms 
in  such  channels  is  at  least  natural,  if  not  optimal,  and  leads  to  a  novel  multirate 
modulation  strategy  in  which  data  is  transmitted  simultaneously  at  multiple  rates. 
We  remark  at  the  outset  that  while  multirate  modulation  has  received  attention  in 
the  recent  literature  (see,  e.g.,  [14]  [71]  [72]  [73]),  the  work  is,  at  best,  peripherally 
related. 

Consider the problem of designing a communication system for the transmission of
continuous- or discrete-valued data sequences over a noisy and unreliable
continuous-amplitude, continuous-time channel. As depicted in Fig. 6-1, the classical structure
for such a system involves a modulator at the transmitter that embeds the data
sequence q[n] into a signal x(t) which is sent over the channel. At the receiver, a
demodulator processes the distorted signal r(t) to extract an optimal estimate of the
data sequence q[n].

In a typical scenario, the channel would be “open” for some time-interval T,
during which it has a particular bandwidth W and signal-to-noise ratio (SNR). In
general, this rather generic model can be used to capture both characteristics of the
transmission medium and constraints inherent in one or more receivers. When the
noise characteristics are additive, the overall channel model is as depicted in Fig. 6-2,
where z(t) represents the noise process.

Figure 6-1: A communication system for transmitting a continuous- or discrete-amplitude
data sequence q[n] over a noisy and unreliable continuous-amplitude,
continuous-time channel.

Figure 6-2: The channel model for a typical communications scenario.

When either the bandwidth or duration parameters of the channel are known a
priori, there are many well-established approaches for designing an optimum
communication system for transmitting q[n] reliably. However, there are a variety of
military and commercial communication contexts in which both the bandwidth and
duration parameters are either unknown or at least inaccessible to the transmitter.¹
This case, by contrast, has received comparatively less attention in the
communications literature, although it encompasses a range of both point-to-point and broadcast
communication scenarios, including

-  channels subject to hostile jamming

-  broadcast communication involving disparate receivers

-  multiple access channels involving time-, frequency-, or code-division
multiplexing (i.e., TDMA, FDMA, or CDMA)

-  packet-switching

-  fading channels, such as meteor burst, ocean acoustic, or mobile radio channels

-  low probability of intercept (LPI) communication

¹Equivalently, we might be interested in designing systems whose performance is optimal over a
range of possible bandwidth and duration parameter combinations.

In  these  situations,  it  is  often  desirable  to  have  a  system  with  the  following  per¬ 
formance  characteristics: 

1.  Given a duration-bandwidth product T × W that exceeds some threshold, we
must be able to transmit q[n] without error in the absence of noise (z(t) = 0).

2.  Given  increasing  duration-bandwidth  product  in  excess  of  this  threshold,  we 
must  be  able  to  transmit  q[n]  with  increasing  fidelity  in  the  presence  of  noise. 
Furthermore,  in  the  limit  of  infinite  duration-bandwidth  product,  perfect  trans¬ 
mission  should  be  achievable  at  any  finite  SNR. 

Note  that  the  first  of  these  requirements  implies  we  ought  to  be  able,  at  least  in 
principle,  to  recover  q[n]  from  arbitrarily  little  bandwidth  given  sufficient  duration, 
or,  alternatively,  from  arbitrarily  little  duration  given  sufficient  bandwidth.  The 
second  requirement  implies  that  we  ought  to  be  able  to  obtain  better  estimates  of 
q[n]  the  longer  the  receiver  is  able  to  listen,  or  the  greater  the  bandwidth  it  has 
available.  Consequently,  the  modulation  must  contain  redundancy  to  exploit  for 
this  error  correction  capability.  As  we  shall  see,  the  use  of  homogeneous  signals  for 
transmission  is  naturally  suited  to  fulfilling  these  requirements. 

In general, we would anticipate that the time-bandwidth threshold involved would
be a function of the length L of the data sequence. For this reason, it is more
convenient to phrase the discussion in terms of a rate-bandwidth ratio R/W. Indeed,
when the duration constraint T is transformed into a symbol rate constraint R defined
by

$$R = L/T, \qquad (6.1)$$

our time-bandwidth threshold is transformed into a rate-bandwidth ratio R/W
threshold that is independent of sequence length.

In a conventional communication system in which the bandwidth is prescribed,
the information-bearing sequence q[n] is modulated into the available bandwidth W
at some rate R such that in the absence of noise, perfect recovery of q[n] is possible.
The spectral efficiency η of such a system is usually defined as the maximum rate R
achievable for a given bandwidth W, i.e.,

$$\eta = \max \frac{R}{W}, \qquad (6.2)$$

which is measured in symbols/sec/Hz. More efficient systems can achieve a higher
rate for a given bandwidth, or, equivalently, support a given rate with less bandwidth.

A reasonable approach to spectrally efficient transmission can be constructed as
follows. A transmitter modulates the data sequence q[n] onto a lowpass waveform by
expanding it into an orthonormal basis to generate the lowpass transmitted waveform

$$x(t) = \sum_n q[n]\, \sqrt{R_0}\, \operatorname{sinc}(R_0 t - n)$$

where

$$\operatorname{sinc}(t) = \begin{cases} 1 & t = 0 \\ \dfrac{\sin \pi t}{\pi t} & \text{otherwise} \end{cases}$$

and where $R_0$ is a fixed parameter of the system. In turn, the receiver recovers q[n]
from the projections

$$q[n] = \int_{-\infty}^{\infty} x(t)\, \sqrt{R_0}\, \operatorname{sinc}(R_0 t - n)\, dt$$

which are implemented as a sequence of filter-and-sample operations. Since

$$\mathcal{F}\left\{ \sqrt{R_0}\, \operatorname{sinc}(R_0 t) \right\} = \begin{cases} 1/\sqrt{R_0} & |\omega| \le \pi R_0 \\ 0 & \text{otherwise} \end{cases}$$

this system achieves a rate of $R = R_0$ symbols/sec using a (double-sided) bandwidth
of $W = R_0$ Hz. Thus, the system is characterized by a spectral efficiency of

$$\eta_0 = 1 \text{ symbol/sec/Hz}. \qquad (6.3)$$
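
The filter-and-sample structure of this idealized system can be illustrated numerically.
The sketch below is only an illustration under stated assumptions: the data sequence is
finite, the receiver replaces the projection integral by sampling x(t) at t = n/R_0 (which
gives the same values for this basis because the sinc function vanishes at the nonzero
integers), and the function names `modulate` and `demodulate_by_sampling` are hypothetical.
NumPy's `np.sinc` uses the same normalized definition, sin(πt)/(πt).

```python
import numpy as np


def modulate(q, r0, t):
    """Evaluate x(t) = sum_n q[n] sqrt(R0) sinc(R0 t - n) at the times t."""
    n = np.arange(len(q))
    # Rows index time instants, columns index symbols.
    basis = np.sqrt(r0) * np.sinc(r0 * t[:, None] - n[None, :])
    return basis @ q


def demodulate_by_sampling(q_len, r0, x_of_t):
    """Recover q[n] as x(n / R0) / sqrt(R0) for n = 0, ..., q_len - 1."""
    sample_times = np.arange(q_len) / r0
    return x_of_t(sample_times) / np.sqrt(r0)


if __name__ == "__main__":
    r0 = 4.0                                   # symbols/sec (illustrative)
    q = np.array([0.7, -1.2, 2.0, 0.1, -0.4])
    q_hat = demodulate_by_sampling(len(q), r0, lambda t: modulate(q, r0, t))
    print(np.allclose(q_hat, q))
```

Changing `r0` rescales the sample times but not the recovered symbols, which mirrors the
time-scaling role of the parameter R_0.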

Furthermore,  it  is  power  efficient  in  the  presence  of  additive  stationary  white  noise. 
Needless to say, however, such a system is not practical. Indeed, the lack of temporal
localization in sinc(t) makes the system not only unrealizable but also subject to infinite
delay. Nevertheless, useful approximations can be implemented using other bases with
better  time  localization  properties  and  with  little  penalty  in  spectral  characteristics. 
Consequently,  the  idealized  system  is  a  useful  baseline  for  the  purposes  of  compar¬ 
ison,  and  (6.3)  is  the  corresponding  performance  benchmark.  We  shall  refer  to  this 
modulation  as  uniformly  most  spectrally  efficient  (UMSE).  Indeed,  it  represents  the 
potential  performance  of  a  system  in  which  the  transmitter  has  perfect  knowledge  of 
the  rate-bandwidth  characteristics  of  the  channel.  Since  under  our  requirements  the 
transmitter  is  constrained  to  have  no  knowledge  of  these  parameters,  and,  hence,  can¬ 
not  reconfigure  itself  accordingly,  UMSE  modulation  provides  a  useful  performance 
bound  in  our  analysis. 

Note that choosing a different value of $R_0$ effects a time-scaling of the system.
The new system has the same efficiency but attains it for a different combination of
rate and bandwidth. However, any one such system is only efficient when operating
at the rate and bandwidth for which it was designed. That is, given a doubling of the
available bandwidth, the system will still function properly, though it will not be able
to exploit the potential to double the symbol rate afforded by the excess bandwidth.
To do this would require a different, re-scaled system. Because this involves a
redesigned transmitter, this change requires that the transmitter have knowledge of the
bandwidth available.

We  now  turn  our  attention  to  the  design  of  a  transmission  scheme  based  upon 
the  concept  of  embedding  the  data  to  be  transmitted  into  a  wide-sense  homogeneous 
signal. As we shall see, this leads to a system that maintains its spectral efficiency
over  a  broad  range  of  rate-bandwidth  combinations  using  a  fixed  transmitter  con¬ 
figuration.  Due  to  the  fractal  properties  of  the  transmitted  signals,  we  refer  to  the 
resulting  scheme  as  fractal  modulation. 


6.1  Transmitter  Design:  Modulation 

Consider a transmitter that expands q[n] in an orthonormal self-similar basis $\theta_n^H(t)$
of arbitrary degree H

$$x(t) = \sum_n q[n]\, \theta_n^H(t)$$

where the basis is constructed from the ideal bandpass wavelet ψ(t) according to
(5.15), viz.,

$$\theta_n^H(t) = \sum_m \beta^{-m/2}\, \psi_n^m(t).$$

When q[n] has finite power, as we shall generally assume, the resulting x(t) is a
power-dominated homogeneous signal and corresponds to a time-frequency portrait of the
form depicted in Fig. 5-3. More generally, we recognize this as a multirate modulation
of q[n]: in the mth channel, q[n] is modulated at rate $2^m$ using a (double-sided)
bandwidth of $2^m$ Hz. Furthermore, the energy per symbol used in successively higher
channels (m) scales by

$$\beta = 2^{\gamma} = 2^{2H+1}.$$

It is apparent that from such a transmission, a suitably designed receiver can recover
q[n] at rate $2^m$ using a baseband bandwidth of $2^{m+1}$ Hz, from which we deduce that
this modulation has a spectral efficiency

$$\eta_F = (1/2) \text{ symbol/sec/Hz},$$

which is half that of our baseline system.² In effect, this loss in efficiency is the price
paid to enable a receiver to use any of an arbitrarily broad range of rate-bandwidth
combinations to accommodate the characteristics of the channel or its own internal
processing constraints.

In Fig. 6-3, we illustrate the rate-bandwidth tradeoffs that can be made at the
receiver in demodulating the transmitted data. In particular, in the absence of noise
a receiver can, in principle, perfectly recover the information sequence q[n] using
rate-bandwidth combinations that lie on or below the solid curve. Note that, according to
our definition, the rate corresponding to B = 1 Hz is $R = \eta_F$, i.e., the spectral
efficiency of the modulation scheme. The stepped character of this curve reflects the fact
that only rates of the form $2^m$ can be accommodated, and that full octave increases in
signal bandwidth are required to enable q[n] to be demodulated at successively higher
rates. For reference, the performance bound corresponding to UMSE modulation is
superimposed on this plot as a dashed line. We emphasize that with UMSE
modulation, the transmitter has perfect knowledge of the rate-bandwidth characteristics of
the channel, while with fractal modulation the transmitter has no knowledge.
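
The stepped rate-bandwidth characteristic can be written down directly. The sketch below is
a hedged illustration: it assumes the ideal bandpass wavelet, no noise, and an unbounded
range of scales, so that rate 2^m is available exactly when the baseband bandwidth is at
least 2^(m+1) Hz. The function names `fractal_max_rate` and `umse_max_rate` are
hypothetical.

```python
import math


def fractal_max_rate(bandwidth_hz):
    """Largest rate 2**m (symbols/sec) with 2**(m + 1) <= bandwidth."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    # frexp gives bandwidth = mantissa * 2**exponent with 0.5 <= mantissa < 1,
    # so floor(log2(bandwidth)) = exponent - 1 without rounding trouble.
    _, exponent = math.frexp(bandwidth_hz)
    return math.ldexp(1.0, exponent - 2)


def umse_max_rate(bandwidth_hz):
    """Benchmark rate for the idealized baseline with 1 symbol/sec/Hz."""
    return float(bandwidth_hz)


if __name__ == "__main__":
    for w in (1.0, 2.0, 3.9, 4.0, 100.0):
        print(w, fractal_max_rate(w), umse_max_rate(w))
```

At exact powers of two the ratio of the two rates is 1/2, and between octave boundaries it
drops below 1/2 until the next full octave of bandwidth becomes available.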

Clearly,  this  fractal  modulation  scheme  is  unrealizable  for  several  reasons.  First, 
the  basis  functions  used  at  the  transmitter  again  have  infinite  length.  Hence,  not 
only  can  such  a  transmitter  not  be  implemented,  but  a  corresponding  receiver  would 
involve  infinite  delay.  However,  we  may  more  generally  replace  the  ideal  bandpass 
wavelet  used  to  synthesize  the  orthonormal  self-similar  basis  with  one  having  compa¬ 
rable  frequency  domain  characteristics  but  better  time  localization  properties.  As  we 
have  discussed,  there  are  many  suitable  wavelets  to  choose  from,  among  which  are 
those  due  to  Daubechies  [1]. 

Another reason that fractal modulation is impractical is associated with the fact
that q[n] is modulated into an infinite number of octave-bandwidth channels. As
a consequence, such a transmitter requires infinite power. However, in a practical
implementation, only a finite collection of contiguous channels $\mathcal{M}$ would be used by
the transmitter. The resulting modulated waveform

$$x(t) = \sum_{m \in \mathcal{M}} \sum_n \beta^{-m/2}\, q[n]\, \psi_n^m(t) \qquad (6.4)$$

then exhibits self-similarity only over a range of scales, and the data can be
demodulated at one of only a finite number of rates. In terms of Fig. 6-3, the rate-bandwidth
characteristic would extend over a finite range of bandwidths which have been chosen
to cover extremes anticipated for the system.

²We emphasize that it is the baseband bandwidth that is important in defining the spectral
efficiency in accordance with our channel model of Fig. 6-2, since it defines the highest frequency
available at the receiver.

Figure 6-3: Spectral efficiency of fractal modulation (rate R in symbols/sec versus
bandwidth W in Hz). At each bandwidth B, the solid curve indicates the maximum rate
at which transmitted data can be recovered. The dashed curve, indicating the
corresponding rate for UMSE modulation, represents a performance bound.

It is important to note that the fractal modulation transmitter can be implemented
in a computationally highly efficient manner. Indeed, much of the processing can be
performed using the discrete-time algorithms of Section 5.3. Assuming the number of
scales to be used is M, the signal x(t) defined via (6.4) is synthesized from q[n] as
follows. First, the algorithm (5.35) is iterated M times to generate $p^{[M]}[n]$ using
the QMF filter pair h[n], g[n] appropriate to the wavelet basis. This modulates q[n]
into a sequence of M contiguous discrete-time channels. Then the resulting sequence
is modulated into the desired bandwidth via the appropriate scaling function φ(t)
according to

$$x(t) = \beta^{-M/2} \sum_n p^{[M]}[n]\, \phi_n^M(t).$$

Here we have assumed, for convenience, the collection of scales

$$\mathcal{M} = \{0, 1, \ldots, M - 1\}$$

where M is some positive integer. It is important to point out that because a
batch-iterative algorithm is employed, potentially large amounts of data buffering may be
required. Hence, while the algorithm may be computationally efficient, it may be
considerably less so in terms of storage requirements. However, in the event that
q[n] is finite length, it is conceivable that the algorithm may be modified so as to be
memory-efficient as well.

In fact, the transmission of finite length sequences raises another issue: in
transmitting finite length messages, a direct implementation of fractal modulation is rather
inefficient, principally because the successively higher channels are increasingly
underutilized. In particular, as is apparent from the time-frequency portrait of Fig. 5-3,
if q[n] has finite length, e.g.,

q[n] = 0,  n < 0,  n > L − 1

then the mth channel will have completed its transmission and go idle in half the time
it took the (m − 1)st channel to complete its transmission, and so forth. However,
finite length messages may be accommodated efficiently by modulating their periodic
extensions

$$\tilde{q}[n] = q[n \bmod L],$$

thereby generating a transmitted waveform

$$x(t) = \sum_n \tilde{q}[n]\, \theta_n^H(t)$$

which constitutes a periodicity-dominated homogeneous signal of the type discussed
in Section 5.2. If we let

$$\mathbf{q} = \{ q[0],\; q[1],\; \ldots,\; q[L-1] \}$$

denote the data vector, then the time-frequency portrait associated with this signal
is shown in Fig. 6-4. Note that not only do we retain the ability to make various
rate-bandwidth tradeoffs at the receiver with this modification, but we acquire a certain
flexibility in our choice of time origin as well. Specifically, as is apparent from
Fig. 6-4, the receiver need not begin demodulating the data at t = 0, but may choose a
time-origin that is some multiple of L/R when operating at rate R. Additionally, this
strategy can, in principle, be extended to accommodate blocking of the data.

Before considering the problem of optimum demodulation, we consider one final
aspect of transmitter configuration. Having assumed an arbitrary choice for the
parameter H, we now consider how this parameter might be appropriately chosen
for a given operating scenario. From the point of view of spectral efficiency, it is
apparent that this parameter has no effect. Indeed, the rate-bandwidth tradeoffs
available to a receiver are independent of the value of H. However, H controls the
relative distribution of power between channels. As such it affects the spectrum of
the fractal modulation signal. As we noted in our discussion of homogeneous signals
in Chapter 5, such power-dominated homogeneous signals have a power spectrum of
the form

$$S_x(\omega) \sim \frac{1}{|\omega|^{\gamma}} \qquad (6.5)$$

where γ = 2H + 1. Hence, the selection of H becomes important when we consider
the presence of additive noise z(t) in the channel of Fig. 6-2.

Figure 6-4: A portion of the time-frequency portrait of the transmitted signal for
fractal modulation of a finite-length data vector q. The case H = −1/2 is shown for
convenience.

For  more  traditional  additive  stationary  Gaussian  noise  channels  in  which,  for 
example,  the  bandwidth  is  known  a  priori,  the  appropriate  spectral  shaping  of  the 
transmitted  signal  is  governed  by  a  “water-filling”  procedure  [14]  [71].  In  fact,  this  is 
the  method  by  which  the  capacity  of  such  channels  is  computed  [74].  Through  this 
procedure,  the  available  signal  power  is  distributed  in  such  a  way  that  proportionally 
more  power  is  located  at  frequencies  where  the  noise  power  is  smaller.  The  graphical 
interpretation  is  one  in  which  the  available  signal  power  is  “poured”  onto  the  noise 
spectrum  within  the  available  bandwidth,  so  that  the  spectral  density  of  the  signal  at 
a  given  frequency  is  given  by  the  distance  from  the  noise  floor  to  the  “water  level.” 

However,  when  there  is  uncertainty  in  the  available  bandwidth,  the  water-filling 
approach  is  a  less  desirable  strategy.  Imagine  a  scenario  in  which  the  noise  power  is 
spectrally flat except in some frequency band $0 < \omega_L < \omega_U < \infty$, where it is zero.
Then a water-filling procedure will locate the signal power predominantly within this
frequency  band.  Consequently,  when  the  channel  bandwidth  is  such  that  it  includes 
this  band,  the  SNR  in  the  channel  will  be  much  higher  than  if  the  bandwidth  did 
not  include  this  band.  Hence,  the  system  performance  would  strongly  depend  on  the 
available  bandwidth.  A  more  reasonable  power  allocation  strategy  for  the  channel  of 
Fig.  6-2,  therefore,  is  to  distribute  power  according  to  a  spectral-matching  rule,  i.e., 
so  as  to  maintain  an  SNR  that  is  independent  of  frequency.  This  leads  to  system 
performance  that  is  uniform  with  variations  in  bandwidth,  as  would  naturally  arise 
out of a min-max type performance criterion. Note, too, that the spectral-matching
rule leads to a transmitted signal that is potentially well-suited for covert and LPI
communication.
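
The contrast between the two power allocation rules can be made concrete with a small
discrete-frequency example. This is a minimal sketch under stated assumptions: the spectrum
is divided into equal-width bins with known noise power per bin, and the function names
`water_filling` and `spectral_matching` are hypothetical. Water-filling raises a common
"water level" above the noise floor; spectral matching allocates power in proportion to the
noise so that the SNR is the same in every bin.

```python
import numpy as np


def water_filling(noise, total_power):
    """Allocate power so that signal + noise is level wherever power > 0."""
    noise = np.asarray(noise, dtype=float)
    order = np.sort(noise)
    # Try using the k quietest bins, from all bins down to one.
    for k in range(len(order), 0, -1):
        level = (total_power + order[:k].sum()) / k
        if level > order[k - 1]:
            break
    return np.maximum(level - noise, 0.0)


def spectral_matching(noise, total_power):
    """Allocate power in proportion to the noise, giving a flat SNR."""
    noise = np.asarray(noise, dtype=float)
    return total_power * noise / noise.sum()


if __name__ == "__main__":
    noise = np.array([1.0, 2.0, 4.0])
    print(water_filling(noise, 3.0))          # power favors the quiet bins
    print(spectral_matching(noise, 3.0) / noise)  # identical SNR in each bin
    # A noise-free band attracts all of the water-filling power.
    print(water_filling([1.0, 0.0, 1.0], 1.0))
```

In the last case a receiver whose bandwidth excludes the middle bin sees no signal power at
all, which is the bandwidth sensitivity described above.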

Since power-dominated homogeneous signals have a power spectrum of the form
of eq. (6.5), the spectral-matching rule suggests that fractal modulation may be
naturally matched to channels with additive 1/f-type noise. As discussed in Chapter 3,
this rather broad class of statistically self-similar noises includes both classical white
Gaussian noise and Brownian motion, as well as some more general and rather
prevalent nonstationary processes characterized by strong long-term statistical dependence.

In this section, we have developed a strategy for embedding an information
sequence q[n] into a homogeneous signal x(t) that satisfies the first of the two system
requirements described at the outset of the chapter. Next, we turn our attention to
the problem of designing receivers for fractal modulation, such that the second of our
system requirements is satisfied.

6.2  Receiver  Design:  Demodulation 

In this section, we consider the problem of recovering the information sequence q[n]
from the observations r(t) available to the receiver. In general, r(t) is assumed to be
a band-limited, time-limited, and noisy version of x(t) consistent with our channel
model of Fig. 6-2, and where the noise z(t) is Gaussian 1/f-type noise. We shall
assume the degree H of the transmitted homogeneous signal has been chosen according
to our spectral-matching rule, i.e.,

$$H = H_{\text{signal}} = H_{\text{noise}}. \qquad (6.6)$$

We remark that if it is necessary that the transmitter measure $H_{\text{noise}}$ in order to
perform this spectral matching, the robust and efficient parameter estimation algorithms
for 1/f processes developed in Section 4.2 may be exploited.

We shall also assume in our analysis that the ideal bandpass wavelet basis has
been used to synthesize x(t). While we recognize that realizable wavelet bases will be
used in practice, we anticipate that the results we obtain will be generally applicable
to many of the approximately bandpass wavelet bases, such as those of Daubechies.
Finally, we shall assume q[n] to be of finite length L, and that its periodic extension
q[n mod L] is modulated as discussed in the previous section.

In our performance analyses exploring rate, bandwidth and SNR tradeoffs, we shall
compare fractal modulation to the performance of UMSE modulation with repetition
coding. That is, we make comparisons to UMSE modulation where redundancy is
provided by transmitting each sample of the sequence q[n] some number K times in
succession. This simple redundancy scheme provides for error correction capability
that is comparable to that of fractal modulation. However, with this coding, UMSE
modulation performance does not constitute a performance bound. Indeed, it is not
“uniformly most power efficient” in two respects. For one, using UMSE
modulation corresponds to distributing signal power uniformly over the available bandwidth,
which is inefficient except in the presence of stationary white noise (H = −1/2). This
follows from the water-filling rule for power allocation. Secondly, even for the case
of white noise, there are much more effective (i.e., more power efficient) redundancy
schemes available for use with channels of known bandwidth; see, e.g., [75].
Nevertheless, with these caveats in mind, such comparisons do lend some insight into the
relative power efficiency of fractal modulation.

6.2.1  Minimum  Mean-Square  Error  Demodulation 

In this section, we assume that q[n] is a continuous-valued sequence of independent,
identically-distributed, zero-mean Gaussian random variables, each with variance

$$\operatorname{Var} q[n] = \sigma_q^2$$

and develop a receiver yielding the minimum mean-square error (MSE) estimate of
q[n] based on our corrupted observations r(t). We begin by projecting our
observations onto the ideal bandpass wavelet basis from which x(t) was synthesized, so that
our observations may be expressed as

$$r_n^m = \beta^{-m/2}\, q[n \bmod L] + z_n^m$$

where the $z_n^m$ are the Gaussian wavelet coefficients of the noise process z(t). Consistent
with the wavelet-based models of 1/f processes we developed in Section 3.2.2, we may
reasonably model the $z_n^m$ as independent zero-mean random variables with variances

$$\operatorname{Var} z_n^m = \sigma_z^2\, \beta^{-m} \qquad (6.7)$$

for some variance parameter $\sigma_z^2 > 0$.
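
A short simulation makes the consequence of spectral matching visible in this model. The
sketch below is illustrative only: it assumes the observation model
r_n^m = β^(−m/2) q[n mod L] + z_n^m with independent Gaussian z_n^m of variance σ_z² β^(−m),
takes the number of coefficients per scale as a caller-supplied argument, and uses
hypothetical function names (`observe_scale`, `per_scale_snr`). Because signal and noise
variances both scale as β^(−m), the per-coefficient SNR is σ_q²/σ_z² at every scale.

```python
import numpy as np


def observe_scale(q, beta, m, count, sigma_z, rng):
    """Draw `count` observation coefficients r_n^m at scale m."""
    n = np.arange(count)
    gain = beta ** (-m / 2)                  # common signal and noise scaling
    signal = gain * q[n % len(q)]            # periodic extension of q[n]
    noise = sigma_z * gain * rng.standard_normal(count)
    return signal + noise


def per_scale_snr(sigma_q, sigma_z, beta, m):
    """Signal variance over noise variance for one coefficient at scale m."""
    return (sigma_q ** 2 * beta ** (-m)) / (sigma_z ** 2 * beta ** (-m))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.standard_normal(4)               # sigma_q = 1 (illustrative)
    beta = 2.0 ** (2 * 0.5 + 1)              # H = 1/2, illustrative
    r = observe_scale(q, beta, m=2, count=8, sigma_z=0.1, rng=rng)
    print(r.shape)
    print([per_scale_snr(1.0, 0.1, beta, m) for m in range(4)])
```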

If the channel is bandlimited to $2^{M_U+1}$ Hz for some integer $M_U$, this precludes access
to the observation coefficients at scales finer than $m = M_U$. Simultaneously, if we
translate the time-limiting in the channel into a constraint on the minimum allowable
rate of $2^{M_L}$ symbols/sec for some integer $M_L$, this precludes access to the observation
coefficients at scales coarser than $m = M_L$. Hence, the observation coefficients $r_n^m$
available at the receiver will correspond to the range of scales

$$M_L \le m \le M_U.$$

We have assumed $M_U \ge M_L$, which corresponds to the case in which we have at least
enough time-bandwidth product to recover q[n] in the absence of noise.

Consistent with both our rate constraint and the fact that we modulate the periodic
extension of q[n], we have available

$$K = \sum_{m=M_L}^{M_U} 2^{m-M_L} = 2^{M_U-M_L+1} - 1 \tag{6.8}$$

measurements of each of the L non-zero samples of the sequence q[n] from which to
compute an optimal estimate. This can be inferred from Fig. 6-4. In particular, at
scale m we have $2^{m-M_L}$ copies of q[n], so that there is a doubling of the number of
available copies at successive scales leading to the geometric sum (6.8). The specific
relationship between rate, bandwidth and K in terms of the spectral efficiency of
fractal modulation can, therefore, be expressed as

$$\frac{R}{W} = \frac{2\eta_F}{K+1}, \tag{6.9}$$

where we note that the case K = 1, for which $M_U = M_L$ and a single copy of q[n]
is available at the receiver, corresponds to the minimum time-bandwidth product for
which reconstruction is possible. In the case of the ideal bandpass wavelet, we established
earlier that $\eta_F = 1/2$. Using more general wavelets, the spectral bandwidth
and hence efficiency of fractal modulation is less well-defined. Nevertheless, for any
reasonable definition of bandwidth, practical wavelets may be chosen for use with
fractal modulation that yield a spectral efficiency close to 1/2. We shall therefore
assume in general that $\eta_F \approx 1/2$ in our subsequent discussion.
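
As a quick numerical companion to (6.8) and (6.9), the following minimal sketch computes the number of available copies K from the scale limits and the resulting rate-bandwidth ratio. The function names `num_copies` and `rate_bandwidth_ratio` are illustrative choices, not notation from the text, and the default $\eta_F = 1/2$ is the ideal bandpass value assumed above.

```python
def num_copies(m_lower: int, m_upper: int) -> int:
    """Number K of noisy copies of each symbol, as the geometric sum in (6.8)."""
    if m_upper < m_lower:
        raise ValueError("need m_upper >= m_lower")
    # Scale m contributes 2**(m - m_lower) copies of each symbol.
    return sum(2 ** (m - m_lower) for m in range(m_lower, m_upper + 1))


def rate_bandwidth_ratio(k: int, eta_f: float = 0.5) -> float:
    """Rate-bandwidth ratio R/W implied by (6.9) for K = k copies."""
    return 2.0 * eta_f / (k + 1)


if __name__ == "__main__":
    for m_upper in range(0, 5):
        k = num_copies(0, m_upper)
        print(m_upper, k, rate_bandwidth_ratio(k))
```

Each additional octave of bandwidth roughly doubles K and halves R/W, which is the dyadic tradeoff the text describes.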

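It is straightforward to derive minimum MSE estimates of q[n] for each n from
the collection of independent, noisy observations

$$\mathbf{r} = \{ r_n^m,\ m \in \mathcal{M},\ n \in \mathcal{N}(m) \}$$

where

$$\mathcal{M} = \{ M_L, M_L + 1, \ldots, M_U \} \tag{6.10}$$

$$\mathcal{N}(m) = \{ 0, 1, \ldots, L\,2^{m-M_L} - 1 \}. \tag{6.11}$$

The optimum estimator is given by

$$\hat{q}[n] = E\left[ q[n] \mid \mathbf{r} \right] = \frac{1}{K + 1/\sigma_c^2} \sum_{m=M_L}^{M_U} \beta^{m/2} \sum_{l=0}^{2^{m-M_L}-1} r_{n+lL}^m \tag{6.12}$$

where $\sigma_c^2$ is the SNR in the channel, defined by

$$\sigma_c^2 = \frac{\sigma_q^2}{\sigma_z^2}. \tag{6.13}$$
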
In general the optimum estimate represents a blend of a priori information about
q[n], and information obtained from the observations. At high SNR ($\sigma_c^2 \gg 1/K$), the
a priori information is essentially ignored, and the resulting estimator specializes to
the maximum likelihood estimator. At low SNR ($\sigma_c^2 \ll 1/K$), the observations are
essentially ignored, and the estimator approaches the a priori estimate

$$\hat{q}[n] = E\left[ q[n] \right] = 0.$$
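
The estimator (6.12) can be illustrated with a small simulation. The sketch below works directly in the wavelet coefficient domain: it assumes the observation model $r_n^m = \beta^{-m/2} q[n \bmod L] + z_n^m$ with noise variances (6.7), and it skips the DWT itself. The helper names `synthesize_observations` and `mmse_estimate` and all parameter values are hypothetical choices made for illustration.

```python
import random


def synthesize_observations(q, m_lower, m_upper, beta, sigma_z, rng):
    """Return {m: [r_n^m]} with r_n^m = beta**(-m/2) * q[n mod L] + z_n^m."""
    length = len(q)
    obs = {}
    for m in range(m_lower, m_upper + 1):
        scale = beta ** (-m / 2.0)
        count = length * 2 ** (m - m_lower)  # size of the index set N(m)
        # Noise standard deviation sigma_z * beta**(-m/2) matches (6.7).
        obs[m] = [scale * q[n % length] + rng.gauss(0.0, sigma_z * scale)
                  for n in range(count)]
    return obs


def mmse_estimate(obs, length, m_lower, m_upper, beta, snr):
    """Evaluate (6.12): rescale, accumulate the copies of each symbol, shrink."""
    k = 2 ** (m_upper - m_lower + 1) - 1
    gain = 1.0 / (k + 1.0 / snr)
    acc = [0.0] * length
    for m in range(m_lower, m_upper + 1):
        weight = beta ** (m / 2.0)
        for n, r in enumerate(obs[m]):
            acc[n % length] += weight * r  # indices n + l*L share one symbol
    return [gain * a for a in acc]


if __name__ == "__main__":
    rng = random.Random(0)
    q = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    obs = synthesize_observations(q, 0, 2, 2.0, 1.0, rng)
    q_hat = mmse_estimate(obs, len(q), 0, 2, 2.0, snr=1.0)
    mse = sum((a - b) ** 2 for a, b in zip(q_hat, q)) / len(q)
    print("empirical MSE:", mse, "predicted:", 1.0 / (1.0 + 7 * 1.0))
```

With unit-variance symbols and K = 7 copies at 0 dB, the empirical error should land near the value 1/(1 + K) predicted by the performance analysis that follows.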

We note that, as in the case of the transmitter, the receiver has a convenient,
computationally-efficient, hierarchical structure based on the DWT. Assuming r(t)
is  bandlimited  to  resolution  2^^ ,  it  may  be  sampled  at  rate  2^'^^ ,  then  successively 
filtered  and  downsampled  to  level  m  =  Ml  according  to  the  wavelet  decomposition 
tree of Fig. 2-6(a). To produce the desired estimate, at each level m, the terms $r_n^m$
from the detail sequence corresponding to the same value of q[n] are collected
together, weighted by

$$\frac{\beta^{m/2}}{K + 1/\sigma_c^2},$$

and accumulated with the weighted $r_n^m$ from previous stages. Again, however, this is
a batch algorithm, and while computationally efficient, may not be efficient in terms
of storage requirements.

Finally,  we  remark  that  the  receiver  (6.12)  we  have  designed  is  a  linear  data 
processor,  as  would  be  anticipated  since  we  have  restricted  the  discussion  to  Gaussian 
sequences  and  Gaussian  noise.  In  non-Gaussian  scenarios,  the  receivers  we  have 
developed  are  the  best  linear  data  processors,  i.e.,  no  other  linear  data  processor  is 
capable of generating an estimate of q[n] with a smaller mean-square error.


Performance 

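The normalized MSE associated with the optimum receiver (6.12) can be readily
derived as

$$\epsilon^2 = \frac{E\left[ (\hat{q}[n] - q[n])^2 \right]}{E\left[ (q[n])^2 \right]} = \frac{E\left[ \mathrm{Var}(q[n] \mid \mathbf{r}) \right]}{\mathrm{Var}\, q[n]} = \frac{1}{1 + K\sigma_c^2}. \tag{6.14}$$

Generally, it is convenient to substitute for K in (6.14) via (6.9) to get

$$\epsilon^2 = \frac{1}{1 + \sigma_c^2 \left[ \dfrac{2\eta_F}{R/W} - 1 \right]} \tag{6.15}$$

where $\eta_F \approx 1/2$, and where $R/W \le \eta_F$ by virtue of our definition of $\eta_F$. From (6.15)
we see, then, that for $R/W \ll \eta_F$, the MSE is given asymptotically by

$$\epsilon^2 \sim \frac{1}{\sigma_c^2}\, \frac{R/W}{2\eta_F}. \tag{6.16}$$

Note that fractal modulation performance (6.15) is independent of the parameter H
when we use spectral matching.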

For comparison, let us consider the MSE performance of UMSE modulation with
repetition-coding in the presence of stationary white Gaussian noise. In this case,
incorporating redundancy reduces the effective rate-bandwidth ratio by a factor of
K, i.e.,

$$\frac{R}{W} = \frac{\eta_0}{K} \tag{6.17}$$

where R is the rate at which the symbols q[n] are transmitted, and where $\eta_0$ is the
efficiency of the modulation without coding, which is unity. The optimum Bayesian
receiver for this scheme, using a minimum MSE criterion, demodulates the repeated
sequence, and averages the terms corresponding to the same value of q[n] to generate
$\hat{q}[n]$. Hence this K-fold redundancy leads to a normalized MSE of

$$\epsilon^2 = \frac{E\left[ (\hat{q}[n] - q[n])^2 \right]}{E\left[ (q[n])^2 \right]} = \frac{1}{1 + \sigma_c^2 K} \tag{6.18}$$

where $\sigma_c^2$ is the SNR, i.e., the ratio of the power in q[n] to the density of the white
noise power spectrum. Combining (6.18) with (6.17) we get

$$\epsilon^2 = \frac{1}{1 + \sigma_c^2\, \dfrac{\eta_0}{R/W}} \tag{6.19}$$

whenever $R/W \le \eta_0$. By comparison with (6.16) we see that when $R/W \ll \eta_0$,

$$\epsilon^2 \sim \frac{1}{\sigma_c^2}\, \frac{R/W}{\eta_0} \tag{6.20}$$

which is essentially (6.16) since the achievable $\eta_0$ is unity and $\eta_F \approx 1/2$ as discussed
earlier. This means that, at least asymptotically, the performance of the two
schemes is comparable in the presence of white noise.
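
The comparison between (6.15) and (6.19) is easy to tabulate. The following sketch evaluates both normalized MSE expressions on a few rate-bandwidth ratios; the function names are illustrative, and the defaults $\eta_F = 1/2$ and $\eta_0 = 1$ are the values assumed in the discussion above.

```python
def mse_fractal(snr: float, rate_bw: float, eta_f: float = 0.5) -> float:
    """Normalized MSE of fractal modulation, per (6.15)."""
    if not 0.0 < rate_bw <= eta_f:
        raise ValueError("(6.15) requires 0 < R/W <= eta_F")
    return 1.0 / (1.0 + snr * (2.0 * eta_f / rate_bw - 1.0))


def mse_repetition(snr: float, rate_bw: float, eta_0: float = 1.0) -> float:
    """Normalized MSE of repetition-coded modulation in white noise, per (6.19)."""
    if not 0.0 < rate_bw <= eta_0:
        raise ValueError("(6.19) requires 0 < R/W <= eta_0")
    return 1.0 / (1.0 + snr * eta_0 / rate_bw)


if __name__ == "__main__":
    snr = 1.0  # 0 dB
    for rate_bw in (0.5, 0.1, 0.01, 0.001):
        print(rate_bw, mse_fractal(snr, rate_bw), mse_repetition(snr, rate_bw))
```

As R/W shrinks, the two columns converge, which is the asymptotic equivalence expressed by (6.16) and (6.20).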

This behavior is reflected in the performance curves of Fig. 6-5 for both fractal
modulation and UMSE modulation with repetition coding. In Fig. 6-5(a), MSE is
shown as a function of R/W at a fixed SNR of 0 dB ($\sigma_c^2 = 1$), while in Fig. 6-5(b), MSE
is shown as a function of SNR at a fixed R/W = 0.1 symbols/sec/Hz. As we expect,
the longer the channel is open, or the greater the available bandwidth in the channel,
the better the performance of fractal modulation. Although a comparison between
the two modulation schemes is appropriate only for the special case of additive white
Gaussian noise channels, we reiterate that the performance of fractal modulation
(6.15) is independent of the spectral exponent of the 1/f noise. By contrast, we would
not, in general, expect (6.19) to describe the performance of UMSE modulation with
repetition-coding in the presence of 1/f noise.


6.2.2  Minimum  Probability-of-Error  Demodulation 

In this section, we address the closely related problem of designing and evaluating
optimal receivers for bit-by-bit signaling using fractal modulation. Specifically, let us
consider the transmission of a random binary data stream via the bi-valued sequence^
q[n] with average energy $E_0$ per bit:

$$q[n] \in \left\{ +\sqrt{E_0},\ -\sqrt{E_0} \right\}, \tag{6.21}$$

representing a 1-bit and a 0-bit, respectively. Moreover, as before, we consider a
channel characterized by additive Gaussian 1/f noise.

^ Although we do not explicitly consider it here, the extension to the case of M-ary signaling is
likewise straightforward.

[Figure 6-5: Error-rate-bandwidth tradeoffs for fractal modulation in noise with the
optimum receiver. The solid lines represent the performance of fractal modulation,
while the dashed lines correspond to the performance of UMSE modulation with
repetition coding. Panel (a): MSE as a function of rate/bandwidth ratio R/W
(symbols/sec/Hz) at 0 dB SNR.]

The problem of optimally decoding each bit can be described in terms of a binary
hypothesis test on the observations r(t), or equivalently, the coefficients $r_n^m$ of these
observations decomposed into the wavelet basis used in the synthesis of x(t). Choosing
the latter of the options, our hypotheses are

$$H_0:\quad r_n^m = -\sqrt{E_0}\, \beta^{-m/2} + z_n^m \tag{6.22a}$$

$$H_1:\quad r_n^m = +\sqrt{E_0}\, \beta^{-m/2} + z_n^m \tag{6.22b}$$

where, again, the $z_n^m$ are the wavelet coefficients of the noise process. Again according
to our wavelet-based models for 1/f noise of Section 3.2.2, under each hypothesis
we may model the $z_n^m$ as independent, zero-mean Gaussian random variables with
variances given by (6.7). Likewise, based on rate and bandwidth limitations in the
channel, the available observation coefficients are

$$\mathbf{r} = \{ r_n^m,\ m \in \mathcal{M},\ n \in \mathcal{N}(m) \}$$

where the ranges $\mathcal{M}$ and $\mathcal{N}(m)$ are given by (6.10) and (6.11). We shall also assume in our
analysis that the ideal bandpass wavelet basis was used in the synthesis of x(t), with
the implication that the analysis applies more generally to a broader class of wavelets.
The likelihood ratio

$$\frac{p_{\mathbf{r}|H_1}(\mathbf{r})}{p_{\mathbf{r}|H_0}(\mathbf{r})}$$

associated with this hypothesis testing problem is readily derived, and can be reduced
to the sufficient statistic

$$\ell = \frac{\sqrt{E_0}}{\sigma_z^2} \sum_{m=M_L}^{M_U} \beta^{m/2} \sum_{l=0}^{2^{m-M_L}-1} r_{n+lL}^m. \tag{6.23}$$

For a minimum probability of error $\Pr(\varepsilon)$ receiver and a random bit stream (i.e.,
equally likely hypotheses), the optimum test can be derived as

$$\ell \ \underset{H_0}{\overset{H_1}{\gtrless}}\ 0,$$

which is intuitively reasonable from symmetry considerations. Indeed, $\ell$ is conditionally
Gaussian under each hypothesis, where the conditional means are

$$E\left[ \ell \mid H_1 \right] = -E\left[ \ell \mid H_0 \right] = K\sigma_c^2 \tag{6.24}$$

and the conditional variances are

$$\mathrm{Var}\left[ \ell \mid H_1 \right] = \mathrm{Var}\left[ \ell \mid H_0 \right] = K\sigma_c^2, \tag{6.25}$$

where $\sigma_c^2$ is the SNR, viz.,

$$\sigma_c^2 = \frac{E_0}{\sigma_z^2}.$$

Performance


The bit-error probability associated with this optimal receiver can be readily derived
as

$$\Pr(\varepsilon) = \Pr(\ell > 0 \mid H_0) = Q\left( \sqrt{K\sigma_c^2} \right) \tag{6.26}$$

where Q(·) is defined, as in Section 4.4, by

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-v^2/2}\, dv,$$

and where we have exploited (6.24) and (6.25). Substituting for K in (6.26) via (6.9)
we can express this error probability in terms of the rate-bandwidth ratio as

$$\Pr(\varepsilon) = Q\left( \sqrt{\sigma_c^2 \left[ \frac{2\eta_F}{R/W} - 1 \right]} \right) \tag{6.27}$$

where $\eta_F \approx 1/2$. Again, the performance of fractal modulation is independent of the
spectral exponent of the noise process when we use spectral matching. Note, too,
that for $R/W \ll \eta_F$, the bit error probability is asymptotically

$$\Pr(\varepsilon) \approx Q\left( \sqrt{\sigma_c^2\, \frac{2\eta_F}{R/W}} \right). \tag{6.28}$$


For UMSE modulation with repetition coding in the presence of stationary white
Gaussian noise, to decode a bit, the optimal receiver naturally demodulates the received
data and averages together the K symbols associated with the transmitted bit,
thereby generating a sufficient statistic. When this statistic is positive, the receiver
decodes a 1-bit, and a 0-bit otherwise. The corresponding performance is, therefore,
given by

$$\Pr(\varepsilon) = Q\left( \sqrt{K\sigma_c^2} \right).$$

Substituting for K via (6.17) yields the error expression

$$\Pr(\varepsilon) = Q\left( \sqrt{\sigma_c^2\, \frac{\eta_0}{R/W}} \right) \tag{6.29}$$

in terms of the rate-bandwidth ratio, where $\eta_0 = 1$. Comparing (6.29) with (6.28),
we note that again since $\eta_0 \approx 2\eta_F$, the asymptotic performance of the two schemes
is effectively equivalent.
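
The error expressions (6.27) and (6.29) can be evaluated with the standard library alone, using the identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$. The sketch below is illustrative only; the function names are hypothetical, and the defaults $\eta_F = 1/2$ and $\eta_0 = 1$ are the values assumed in the text.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x), written in terms of erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_fractal(snr: float, rate_bw: float, eta_f: float = 0.5) -> float:
    """Bit-error probability of fractal modulation, per (6.27)."""
    if not 0.0 < rate_bw <= eta_f:
        raise ValueError("(6.27) requires 0 < R/W <= eta_F")
    return q_function(math.sqrt(snr * (2.0 * eta_f / rate_bw - 1.0)))


def ber_repetition(snr: float, rate_bw: float, eta_0: float = 1.0) -> float:
    """Bit-error probability of repetition coding in white noise, per (6.29)."""
    if not 0.0 < rate_bw <= eta_0:
        raise ValueError("(6.29) requires 0 < R/W <= eta_0")
    return q_function(math.sqrt(snr * eta_0 / rate_bw))


if __name__ == "__main__":
    for rate_bw in (0.5, 0.1, 0.05, 0.01):
        print(rate_bw, ber_fractal(1.0, rate_bw), ber_repetition(1.0, rate_bw))
```

Because the argument of Q grows like the square root of 1/(R/W), the error probability drops sharply once R/W is small, which is the thresholding behavior noted in the discussion of Fig. 6-6.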

This is apparent in Fig. 6-6, where we plot the bit-error performance of fractal
modulation in bit-by-bit signaling alongside the corresponding performance of UMSE
modulation with repetition coding. In Fig. 6-6(a), $\Pr(\varepsilon)$ is shown as a function of
R/W at a fixed SNR of 0 dB ($\sigma_c^2 = 1$), while in Fig. 6-6(b), $\Pr(\varepsilon)$ is shown as a
function of SNR at a fixed R/W = 0.1 bits/sec/Hz. Both these plots reveal strong
thresholding behavior whereby the error probability falls off dramatically at high SNR
and low R/W. Although the comparison between the two modulations is generally
appropriate only for the special case of additive white Gaussian noise channels, we
reiterate that the performance of fractal modulation (6.27) is independent of the
spectral exponent of the 1/f noise. By contrast, we would not expect (6.29) to
describe the performance of UMSE modulation with repetition-coding in the presence
of 1/f noise in general.


6.3  Outstanding  Issues 

This chapter constitutes a highly preliminary development of fractal modulation.
There are several aspects of the modulation that require further study and evaluation.
For example, of significant practical interest is the robustness of fractal modulation
with respect to channel modeling errors and numerical implementation errors.
A comprehensive evaluation of sensitivity to such errors would involve both Monte
Carlo simulations with synthetic data, and experimental implementations with real
channels.

It is also important to consider whether fractal modulation constitutes an optimal
modulation strategy in some sense. Specifically, while the scheme is apparently
well-suited to channels of the type illustrated in Fig. 6-2 in which the bandwidth and
duration are unavailable to the transmitter, we have not established that it is optimal
with respect to any particular performance criterion. This question certainly deserves
attention, as does the associated problem of computing the capacity of channels of
this type.

In addition, there are a number of issues pertaining to fractal modulation that need
to be addressed before it can be applied in various contexts. For example, accurate
synchronization between transmitter and receiver would appear to be critical in the
scheme, perhaps more so than in other schemes. Consequently, we would anticipate
that the development of techniques for achieving the necessary synchronization would
be important for any practical implementation. Another issue pertains to the use of
fractal modulation in LPI applications. While we have argued that the second-order
statistics of homogeneous signals can be effectively indistinguishable from those of
1/f noises, a more comprehensive study of the detectability of homogeneous signals
is warranted. In particular, it would be important to study the vulnerability of
fractal modulation to traditional nonlinear feature detection techniques including
square-and-correlate processing [76].

[Figure 6-6: Bit error probabilities for bit-by-bit signaling with fractal modulation
over noisy channels with the optimum receiver. Solid lines indicate the performance of
fractal modulation, while dashed lines indicate the performance of UMSE modulation
with repetition coding. Panel (a): Probability of error Pr(ε) as a function of
rate/bandwidth ratio R/W (symbols/sec/Hz) at 0 dB SNR.]

This leads naturally to a consideration of some potentially important extensions
to the basic fractal modulation strategy. For example, in many spread spectrum
scenarios, it is important to obscure the spreading technique. In direct-sequence spread
spectrum, not only is each symbol in the information sequence repeated K times
prior to transmission, but the redundant stream is premultiplied by a pseudorandom
bit stream known to both transmitter and receiver. This makes the transmitted
sequence appear like white (or at least broadband) noise to a listener without detailed
knowledge of the pseudorandom sequence. For fractal modulation, the analogous
processing might involve premultiplying the entire wavelet coefficient field $x_n^m$ of the
transmitted signal x(t) by a pseudorandom bit field known to both transmitter and
receiver. In turn, this would appear to have the effect of making x(t) appear like
1/f-type noise to a listener lacking detailed knowledge of the pseudorandom bit field.
Such an extension may well warrant investigation.

Another extension worth considering concerns the incorporation of efficient coding
techniques for use with fractal modulation. As developed, the redundancy in the
transmitted signal takes the form of multirate repetition in the time-frequency plane.
Perhaps block or trellis coding techniques [75] can be exploited in improving the power
efficiency of fractal modulation. At first glance, it would seem that coding of this type
cannot be incorporated without sacrificing properties of the transmission scheme.
Nevertheless, it would be important to clearly establish the tradeoffs involved.

Finally, fractal modulation would be of potential interest in a greater range of
applications were it possible to accommodate a finer lattice of rate-bandwidth tradeoffs
than those prescribed by the dyadic grid we have considered. Indeed, from a
communications perspective, it is rare to have channels whose bandwidth spans an octave,
much less multiple octaves. However, it is reasonable to expect that should practical
families of non-dyadic orthonormal wavelet bases of the type described in Section 2.2.7
emerge, the concept of fractal modulation could be accordingly extended.

In summary, while fractal modulation is not yet sufficiently developed that it
constitutes a viable communication system in its own right, it does represent an
interesting, novel and potentially important paradigm for communication in many
contexts.
Chapter 7


Linear  Self-Similar  Systems 


This thesis has explored some important classes of statistically and deterministically
self-similar signals. This chapter represents a preliminary investigation into the
relationships between self-similar signals and self-similar systems. In particular, we
explore not only how we may interpret some of our methods for synthesizing
self-similar signals in the context of driven self-similar systems, but also the role that the
wavelet transform plays in characterizing such systems. In the end, this leads to
some interesting and potentially important insights and perspectives into the results
of the thesis, and suggests some promising future directions for work in this area.

The self-similar systems we discuss in this chapter have the property that they are
linear and jointly time- and scale-invariant. In the first half of the chapter we define
this class of systems, develop several properties, and show how both the Laplace
and Mellin transforms can be used in studying these systems. In the latter half of the
chapter we develop wavelet-based characterizations of this class of systems to illustrate
that the wavelet transform is in some sense best matched to these systems, i.e., that
such characterizations are as natural and as useful for these systems as Fourier-based
characterizations are for linear time-invariant systems. We first briefly review some
results in the theory of linear time-invariant systems. For a comprehensive treatment,
see, e.g., [77] or [78].
7.1  Linear  Time-Invariant  Systems


Suppose y(t), $y_1(t)$, and $y_2(t)$ are the responses of a system $\mathcal{S}\{\cdot\}$ to arbitrary inputs
x(t), $x_1(t)$, and $x_2(t)$, respectively. Then the system is linear when it satisfies the
superposition principle

$$\mathcal{S}\{ a x_1(t) + b x_2(t) \} = a y_1(t) + b y_2(t) \tag{7.1}$$

for any a and b, and time-invariant when it satisfies

$$\mathcal{S}\{ x(t - \tau) \} = y(t - \tau) \tag{7.2}$$

for any constant $\tau$. Collectively the properties (7.1) and (7.2) characterize a linear
time-invariant (LTI) system.

A linear system is time-invariant if and only if its kernel

$$\kappa(t, \tau) = \mathcal{S}\{ \delta(t - \tau) \}$$

satisfies

$$\kappa(t, \tau) = \kappa(t - b, \tau - b) \tag{7.3}$$

for any b. For this class of systems, the kernel has the form

$$\kappa(t, \tau) = v(t - \tau)$$

where v(t) is the impulse response of the system. Furthermore, the corresponding
input-output relation is, of course, the usual convolution,

$$y(t) = \int_{-\infty}^{\infty} x(\tau)\, v(t - \tau)\, d\tau = x(t) * v(t).$$

The eigenfunctions of LTI systems are complex exponentials of the form $e^{st}$, from
which we get that the Laplace transform possesses a convolution property, i.e., for
signals x(t) and y(t) with Laplace transforms X(s) and Y(s), respectively, we have

$$x(t) * y(t) \longleftrightarrow X(s)\, Y(s).$$

7.2  Linear  Scale-Invariant  Systems 

In contrast to linear time-invariant systems, linear scale-invariant system theory has
been comparatively less explored, though it has received occasional attention in the
systems [79] and pattern recognition [80] literature, and in the broader mathematics
literature in connection with the Mellin transform [81] [82] [83].

Suppose y(t) is the response of a system $\mathcal{S}\{\cdot\}$ to an arbitrary input x(t). Then a
system $\mathcal{S}\{\cdot\}$ is said to be scale-invariant whenever

$$\mathcal{S}\{ x(t/\tau) \} = y(t/\tau) \tag{7.4}$$

for any constant $\tau > 0$. When there is risk of ambiguity, we will refer to such systems
as strict-sense scale-invariant systems to distinguish them from generalized
scale-invariant systems we will develop subsequently. A system satisfying both (7.1) and
(7.4) will be referred to as a linear scale-invariant (LSI) system.

It is straightforward to show that a necessary and sufficient condition for the kernel
$\kappa(t, \tau)$ of a linear system to correspond to a scale-invariant system is that it satisfy

$$\kappa(t, \tau) = a\, \kappa(at, a\tau) \tag{7.5}$$

for any a > 0.

A linear scale-invariant system is generally characterized in terms of the
lagged-impulse response pair

$$\xi_+(t) = \mathcal{S}\{ \delta(t - 1) \}$$
$$\xi_-(t) = \mathcal{S}\{ \delta(t + 1) \}.$$
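Indeed, when an input x(t) can be decomposed, except at t = 0, as

$$x(t) = \int_0^{\infty} x(\tau)\, \delta\!\left( \frac{t}{\tau} - 1 \right) \frac{d\tau}{\tau} + \int_0^{\infty} x(-\tau)\, \delta\!\left( \frac{t}{\tau} + 1 \right) \frac{d\tau}{\tau} \tag{7.6}$$

we can exploit the superposition principle (7.1) together with (7.4) to get the following
input-output relation

$$y(t) = \int_0^{\infty} x(\tau)\, \xi_+\!\left( \frac{t}{\tau} \right) \frac{d\tau}{\tau} + \int_0^{\infty} x(-\tau)\, \xi_-\!\left( \frac{t}{\tau} \right) \frac{d\tau}{\tau}. \tag{7.7}$$

For simplicity of exposition, we will restrict our subsequent discussion to the case
of causal inputs

$$x(t) = 0, \qquad t < 0$$

and LSI systems whose outputs are causal

$$y(t) = \mathcal{S}\{ x(t) \} = 0, \qquad t < 0.$$

From the development, it will be apparent how to accommodate the more general
scenario of (7.7). For causal signals, the input-output relation (7.7) simplifies to

$$y(t) = \int_0^{\infty} x(\tau)\, \xi\!\left( \frac{t}{\tau} \right) \frac{d\tau}{\tau} = x(t) \star \xi(t) \tag{7.8}$$

where we let $\xi(t) = \xi_+(t)$ to simplify our notation, and where we use the symbol $\star$
to distinguish this convolutional relationship from the usual convolution $*$ associated
with LTI systems. Note that for these LSI systems the kernel is

$$\kappa(t, \tau) = \frac{1}{\tau}\, \xi\!\left( \frac{t}{\tau} \right).$$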
This new convolution operation possesses many of the properties of the usual
convolution operation. For example, it is straightforward to show that it is commutative
for well-behaved operands, i.e.,

$$x(t) \star \xi(t) = \xi(t) \star x(t) = \int_0^{\infty} \xi(\tau)\, x\!\left( \frac{t}{\tau} \right) \frac{d\tau}{\tau}. \tag{7.9}$$

As a consequence, the cascade of two LSI systems with lagged-impulse responses $\xi_1(t)$
and $\xi_2(t)$, respectively, is equivalent to a single system with lagged-impulse response
$\xi_1(t) \star \xi_2(t)$. Furthermore, the systems may be cascaded in either order without
changing the overall system.

Likewise, it is straightforward to show that the new convolution operation is
distributive for well-behaved operands, i.e.,

$$x(t) \star \{ \xi_1(t) + \xi_2(t) \} = x(t) \star \xi_1(t) + x(t) \star \xi_2(t). \tag{7.10}$$

Hence, the parallel connection of two LSI systems with lagged-impulse responses $\xi_1(t)$
and $\xi_2(t)$, respectively, is equivalent to a single system with lagged-impulse response
$\xi_1(t) + \xi_2(t)$.
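
The commutativity property (7.9) can be checked numerically. The sketch below approximates the scale convolution (7.8) by substituting $\tau = e^u$, which turns $d\tau/\tau$ into $du$, and then applying the trapezoidal rule. The function `scale_convolve` and the two log-Gaussian test signals are hypothetical choices made for illustration; they were picked because the result then has a simple closed form, $\sqrt{\pi/1.5}\, \exp(-(\ln t)^2/3)$, obtained from the standard Gaussian convolution integral.

```python
import math


def scale_convolve(x, xi, t, half_width=12.0, steps=4800):
    """Approximate y(t) = integral_0^inf x(tau) xi(t/tau) dtau/tau for t > 0.

    With tau = exp(u) the measure dtau/tau becomes du, so a trapezoidal rule
    in u over [-half_width, half_width] is used.
    """
    du = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        tau = math.exp(-half_width + i * du)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x(tau) * xi(t / tau)
    return total * du


def bump_a(t):
    """Test signal exp(-(ln t)^2), defined for t > 0."""
    return math.exp(-math.log(t) ** 2)


def bump_b(t):
    """Test signal exp(-(ln t)^2 / 2), defined for t > 0."""
    return math.exp(-0.5 * math.log(t) ** 2)


if __name__ == "__main__":
    t = 2.0
    forward = scale_convolve(bump_a, bump_b, t)
    reverse = scale_convolve(bump_b, bump_a, t)
    exact = math.sqrt(math.pi / 1.5) * math.exp(-math.log(t) ** 2 / 3.0)
    print(forward, reverse, exact)
```

The two orderings agree to within numerical precision, consistent with the claim that LSI systems may be cascaded in either order.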

The eigenfunctions of linear scale-invariant systems are homogeneous functions of
degree s; specifically they are the complex power functions defined by

$$x(t) = t^s, \tag{7.11}$$

where s is a complex number. Indeed, from (7.8) and (7.9) the response of an LSI
system to (7.11) is readily obtained as

$$y(t) = \Xi(s)\, t^s$$

with the associated complex eigenvalue given by

$$\Xi(s) = \int_0^{\infty} \xi(\tau)\, \tau^{-s-1}\, d\tau \tag{7.12}$$

whenever this integral converges. Eq. (7.12) is referred to as the Mellin transform of
the signal $\xi(t)$.
The eigenfunction property of the complex power functions implies that the Mellin
transform constitutes an important tool in the analysis of LSI systems.^ Indeed, it is
particularly convenient to compute the response of an LSI system to any input that
is the superposition of eigenfunctions. Fortunately, a broad class of signals x(t) can
be expressed as a superposition of eigenfunctions of LSI systems according to

$$x(t) = \frac{1}{2\pi j} \int_{c - j\infty}^{c + j\infty} X(s)\, t^s\, ds \tag{7.13a}$$

for t > 0, where

$$X(s) = \int_0^{\infty} x(\tau)\, \tau^{-s-1}\, d\tau \tag{7.13b}$$

and c is in the region of convergence of X(s). Eqns. (7.13) collectively constitute
the Mellin representation of a signal x(t): the Mellin inverse formula (7.13a) is the
synthesis relation, while the Mellin transform (7.13b) is the analysis formula.
Interestingly, we may interpret the Mellin transformation as a representation of x(t) by
its "fractional" moments.

The Mellin inverse formula implies that a broad class of linear scale-invariant
systems are completely characterized by the Mellin transform of their lagged-impulse
response, $\Xi(s)$. Consequently, we can refer to this quantity as the system function
associated with the LSI system. As a consequence of the eigenfunction property of
the complex power functions, the input-output relation for a linear scale-invariant
system with system function $\Xi(s)$ can be expressed in the Mellin domain as

$$Y(s) = \Xi(s)\, X(s) \tag{7.14}$$

whenever both terms on the right-hand side have a common region of convergence.
Hence, via the Mellin transform, we can map our convolution operation (7.8) into a
convenient multiplicative operation (7.14).

^ Actually, we have chosen a slight but inconsequential variant of the Mellin transform; the usual
Mellin transform has s replaced by -s in our definition.

The Mellin transform, its inversion formula, properties, and numerous transform
pairs are well-documented in the literature. See, e.g., [83] [84] [85]. One basic Mellin
transform pair is given by

$$t^{-s_0}\, u(t - 1) \longleftrightarrow \frac{1}{s + s_0}, \qquad \mathrm{Re}(s) > -s_0 \tag{7.15}$$

for arbitrary $s_0$.
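
The pair (7.15) can be verified numerically under the sign convention of (7.13b). The sketch below is an illustration only: `mellin_transform` is a hypothetical helper that substitutes $\tau = e^u$, so that $\tau^{-s-1}\, d\tau$ becomes $e^{-su}\, du$, and integrates by the trapezoidal rule over a finite range of u chosen by the caller. The value $s_0 = 1$ and the test point s are arbitrary choices, and the unit step is taken to be 1 at t = 1.

```python
import cmath
import math


def mellin_transform(x, s, u_min, u_max, steps=40000):
    """Approximate X(s) = integral x(tau) tau^(-s-1) dtau over
    exp(u_min) <= tau <= exp(u_max), using tau = exp(u) and the
    trapezoidal rule in u."""
    du = (u_max - u_min) / steps
    total = 0j
    for i in range(steps + 1):
        u = u_min + i * du
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x(math.exp(u)) * cmath.exp(-s * u)
    return total * du


S0 = 1.0  # illustrative value of s_0


def lagged_power(t):
    """The signal t^(-s_0) u(t - 1) appearing on the left of (7.15)."""
    return t ** (-S0) if t >= 1.0 else 0.0


if __name__ == "__main__":
    s = 0.5 + 2.0j  # satisfies Re(s) > -s_0
    # The signal vanishes for tau < 1, so integration starts at u = 0.
    numeric = mellin_transform(lagged_power, s, 0.0, 40.0)
    print(numeric, 1.0 / (s + S0))
```

By the eigenfunction property, the same number is the gain that an LSI system with this lagged-impulse response applies to the input $t^s$.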

From this pair we are able to show that the Mellin transform plays an important
role in the solution of a class of scale-differential equations that give rise to linear
scale-invariant systems. We begin by quantifying the notion of a "derivative operator
in scale." A reasonable definition of the derivative in scale of a signal x(t) is given by

$$\nabla_s x(t) = \lim_{\varepsilon \to 1} \frac{x(\varepsilon t) - x(t)}{\ln \varepsilon}.$$

One can readily interpret this definition in the context of traditional derivatives as

$$\nabla_s x(t) = \frac{d}{d \ln t}\, x(t) = t\, \frac{d}{dt}\, x(t).$$

Differentiation in scale corresponds to a multiplication by s in the Mellin domain,
which suggests that the Mellin transform can be used to efficiently solve what can
be described as a class of "dynamical systems in scale."^ Consider the following
Nth-order linear constant-coefficient scale-differential equation

$$\sum_{k=0}^{N} a_k \nabla_s^k\, y(t) = \sum_{k=0}^{M} b_k \nabla_s^k\, x(t)$$

where we denote the kth derivative in scale, obtained by iterative application of
the derivative operator, by $\nabla_s^k$. Then, via the convolution property of the Mellin
transform, we obtain

$$Y(s) = \Xi(s)\, X(s)$$

where $\Xi(s)$ is rational, i.e.,

$$\Xi(s) = \frac{\sum_{k=0}^{M} b_k s^k}{\sum_{k=0}^{N} a_k s^k},$$

in the corresponding region of convergence. The usual partial fraction expansion
approach, together with Mellin pairs of the form (7.15), can be used to derive y(t)
from its Mellin transform.

^ We remark that this development raises some interesting questions regarding connections to the
more general literature that is evolving on multiscale systems [86] [58]. Such relationships, however,
have yet to be explored.
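
The derivative in scale lends itself to a simple finite-difference check. The sketch below approximates $[x(\varepsilon t) - x(t)]/\ln \varepsilon$ with $\varepsilon$ slightly greater than 1 and confirms two facts used above: power functions $t^s$ are eigenfunctions with eigenvalue s, and $t^{-s_0}$ satisfies the first-order homogeneous scale-differential equation $\nabla_s \xi + s_0 \xi = 0$ for t > 1, in line with the pair (7.15). The helper name `scale_derivative` and the step size are illustrative assumptions.

```python
import math


def scale_derivative(x, t, eps=1.0 + 1e-6):
    """Finite-difference approximation of the derivative in scale,
    [x(eps * t) - x(t)] / ln(eps), with eps close to 1."""
    return (x(eps * t) - x(t)) / math.log(eps)


if __name__ == "__main__":
    # Power functions are eigenfunctions: the scale derivative of t^3 is 3 t^3.
    print(scale_derivative(lambda t: t ** 3, 2.0), 3.0 * 2.0 ** 3)

    # Equivalent to t * dx/dt: for x(t) = ln t the result is 1 at every t.
    print(scale_derivative(math.log, 5.0))

    # xi(t) = t^(-s0) solves (scale derivative of xi) + s0 * xi = 0 for t > 1.
    s0 = 2.0
    xi = lambda t: t ** (-s0)
    print(scale_derivative(xi, 3.0) + s0 * xi(3.0))
```

The small residuals come from the finite step in $\varepsilon$ and shrink as $\varepsilon$ approaches 1, until floating-point cancellation dominates.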

It is interesting to note that in the 1950s, Gerardi [79] developed such an approach
for the synthesis and analysis of time-varying networks governed by scale-differential
and Euler-Cauchy equations, although he did not recognize the relationship to a
linear scale-invariant system theory. In addition, he also derived the convolution
relationship (7.8).

Before we turn our attention to a more broadly defined class of LSI systems, we
remark that there is, in fact, a natural homomorphism between linear scale-invariant
and linear time-invariant (LTI) systems. This relationship allows us to derive virtually
all the results described in this section, in addition to many others, by mapping
corresponding properties from the theory of LTI systems. Specifically, by replacing time t
with exponential time $e^t$, we find, for example, that LSI systems become LTI systems,
complex power functions become complex exponentials, the Mellin transform becomes
the bilateral Laplace transform, and linear constant-coefficient scale-differential
equations become familiar linear constant-coefficient differential equations.


7.2.1  Generalized  Linear  Scale- Invariant  Systems 

Suppose y(t) is the response of a system S{·} to an arbitrary input x(t). Then we
shall say the system S{·} is scale-invariant with parameter λ whenever

$$S\{x(t/\tau)\} = \tau^{\lambda}\, y(t/\tau) \tag{7.16}$$

for any constant τ > 0. We will denote systems that satisfy the superposition
principle (7.1) and the generalized scale-invariance relation (7.16) as LSI(λ) systems.
Obviously, strict-sense LSI systems correspond to the special case λ = 0. It can be
easily established that a necessary and sufficient condition for a linear system to be
scale-invariant with parameter λ is that the kernel satisfy

$$\kappa(t,\tau) = a^{-(\lambda-1)}\, \kappa(at, a\tau) \tag{7.17}$$

for any a > 0.

Such generalized linear scale-invariant systems are also completely characterized
in terms of their lagged-impulse response pair

$$\xi_+(t) = S\{\delta(t-1)\}$$

$$\xi_-(t) = S\{\delta(t+1)\}.$$

Again, when we are able to decompose our input according to (7.6) and restrict
our attention to the case of causal signals, we can exploit (7.1) and (7.16) to get the
following input-output relation

$$y(t) = \int_0^\infty x(\tau)\, \xi\!\left(\frac{t}{\tau}\right) \tau^{\lambda}\, \frac{d\tau}{\tau}. \tag{7.18}$$

Rewriting (7.18) as

$$y(t) = \int_0^\infty \left\{ x(\tau)\, \tau^{\lambda} \right\} \xi\!\left(\frac{t}{\tau}\right) \frac{d\tau}{\tau}$$

we observe that any LSI(λ) system can be implemented as the cascade of a system
which multiplies the input by |t|^λ, followed by a strict-sense LSI system with
lagged-impulse response ξ(t). However, in many cases, this may not be a particularly
convenient implementation, either conceptually or practically.
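The cascade interpretation can be checked numerically. The sketch below is an
illustration under stated assumptions rather than the author's implementation: the
input x, the lagged-impulse response xi, the value of the parameter, and the log-time
quadrature grid are all hypothetical choices. It evaluates the cascade (multiply by
t^λ, then apply a strict-sense LSI system) and compares the response to a dilated
input against the generalized scale-invariance relation (7.16).

```python
import numpy as np

lam = 0.5                                   # scale-invariance parameter (illustrative)
w = np.linspace(-12.0, 12.0, 4801)          # log-time grid, tau = exp(w)
dw = w[1] - w[0]

def x(t):
    """Hypothetical causal input, smooth in log-time."""
    return np.exp(-0.5 * np.log(t) ** 2)

def xi(t):
    """Hypothetical lagged-impulse response of the strict-sense LSI stage."""
    return np.exp(-0.5 * np.log(t) ** 2)

def lsi_lambda(inp, t):
    """y(t) = int_0^inf {inp(tau) tau^lam} xi(t/tau) dtau/tau, using tau = exp(w)."""
    tau = np.exp(w)
    pre = inp(tau) * tau ** lam             # first stage: multiply the input by t^lam
    return np.sum(pre * xi(t / tau)) * dw   # second stage: strict-sense LSI system

T, t0 = 3.0, 1.7                            # dilation factor and evaluation time
lhs = lsi_lambda(lambda t: x(t / T), t0)    # response to the dilated input x(t/T)
rhs = T ** lam * lsi_lambda(x, t0 / T)      # T^lam * y(t/T), as required by (7.16)
print(lhs, rhs)
```

The two printed values should agree to within quadrature error for any T > 0 and any
evaluation time, since the change of variable τ → Tτ leaves dτ/τ unchanged.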

7.3  Linear  Time-  and  Scale-Invariant  Systems 

We will say that a system is linear time- and scale-invariant with parameter λ,
denoted LTSI(λ), whenever it jointly satisfies the properties of superposition (7.1),
time-invariance (7.2) and generalized scale-invariance (7.16). In this case, the
time-invariance constraint (7.3) requires the kernel to be of the form

$$\kappa(t,\tau) = v(t - \tau)$$

for some impulse response v(·), while the scale-invariance constraint (7.17) imposes
that the impulse response be a generalized homogeneous function of degree λ - 1, i.e.,

$$v(t) = a^{-(\lambda-1)}\, v(at)$$

for all t and all a > 0. Following Gel'fand, et al. [67], we can parameterize the
entire class of impulse responses for such systems. In particular, provided
λ ≠ 0, -1, -2, ..., we get that v(t) takes the form

$$v(t) = C_1 |t|^{\lambda-1} u(t) + C_2 |t|^{\lambda-1} u(-t). \tag{7.19a}$$

For the special case λ = -n for n = 0, 1, 2, ...,

$$v(t) = C_3 |t|^{-(n+1)} u(t) + C_4 |t|^{-(n+1)} u(-t) + C_5\, \delta^{(n)}(t) \tag{7.19b}$$

where δ^(n)(t) denotes the nth derivative of the unit impulse and u(t) the unit step
function. In both cases, the C_1, ..., C_5 are arbitrary constants.
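As a quick numerical sanity check on the parameterization, the sketch below evaluates
the regular part of the impulse response, C_1 |t|^(λ-1) u(t) + C_2 |t|^(λ-1) u(-t), and
measures how well it satisfies homogeneity of degree λ - 1. The function name `v`, the
constants, and the sample points are illustrative assumptions; singular terms such as
impulses are not represented.

```python
import numpy as np

def v(t, lam, c1=1.0, c2=0.5):
    """Regular impulse response c1 |t|^(lam-1) u(t) + c2 |t|^(lam-1) u(-t), for t != 0."""
    t = np.asarray(t, dtype=float)
    step_pos = (t > 0).astype(float)        # u(t), evaluated away from t = 0
    step_neg = (t < 0).astype(float)        # u(-t)
    return (c1 * step_pos + c2 * step_neg) * np.abs(t) ** (lam - 1.0)

lam = 0.75
t = np.array([-3.0, -0.4, 0.2, 1.0, 5.0])

# Homogeneity of degree lam - 1: v(t) = a^{-(lam - 1)} v(a t) for every a > 0.
residuals = [np.max(np.abs(v(t, lam) - a ** (-(lam - 1.0)) * v(a * t, lam)))
             for a in (0.5, 2.0, 7.3)]
print(residuals)
```

With `lam=1.0`, `c1=1.0`, and `c2=0.0` the same function returns the unit step, which
is the integrator example discussed in the text.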

There are many familiar LTSI(λ) systems. For example, the identity system, for
which

$$v(t) = \delta(t)$$

corresponds to λ = 0, C_3 = C_4 = 0 and C_5 = 1. In fact, as is apparent from the
parameterizations (7.19), the identity system is the only stable LTSI(λ) system. A
second example is the integrator. This system has a regular impulse response

$$v(t) = u(t)$$

and corresponds to λ = 1, C_1 = 1, and C_2 = 0. As a final example, consider a
differentiator, which has for an impulse response the unit doublet

$$v(t) = \delta'(t).$$

This choice corresponds to λ = -1, C_3 = C_4 = 0, and C_5 = 1.

Linear time- and scale-invariant systems are natural candidates for modeling and
processing self-similar signals, as we begin to show in the next section.

7.3.1 Self-Similar Signals and LTSI(λ) Systems

In this section, we explore some relationships between self-similar signals and systems.
In particular, we show how LTSI(λ) systems preserve the time-invariance and
scale-invariance of their inputs, and point out how these properties have been exploited in
some of the models for self-similar signals described earlier in the thesis.

Our result in the deterministic case is as follows. Let v(t) be the impulse response
of an LTSI(λ) system, so that v(t) is homogeneous of degree λ - 1, i.e., for any a > 0

$$v(t) = a^{-(\lambda-1)}\, v(at),$$

and consider driving the system with a scale-invariant input signal x(t) that is
homogeneous of degree H. Then it is straightforward to establish that the output y(t) of
the system

$$y(t) = \int_{-\infty}^{\infty} x(\tau)\, v(t - \tau)\, d\tau$$

when well-defined, is scale-invariant as well. In fact, it is homogeneous of degree
H + λ, so that, for any a > 0

$$y(t) = a^{-(H+\lambda)}\, y(at). \tag{7.20}$$

There are two obvious special cases. The first corresponds to the case in which
the system is the identity system (λ = 0). Here the output and input are identical,
and (7.20) yields the appropriate result. The second corresponds to the case in which
the input is an impulse (H = -1). Here, the output is v(t), and, again, (7.20) yields
the correct result. This, of course, suggests that at least one synthesis for a class of
homogeneous signals is in terms of an LTSI(λ) system driven by an impulse.
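The degree-(H + λ) scaling of the output can be verified on a simple causal example.
The sketch below is illustrative only: it assumes the input x(t) = t^H u(t) and the
regular impulse response v(t) = t^(λ-1) u(t), with H and λ chosen (hypothetically) so
that the convolution integral converges, and approximates the integral with a
midpoint rule. For these choices the output has the closed form B(H+1, λ) t^(H+λ),
where B is the Euler beta function.

```python
import math
import numpy as np

H, lam = 0.5, 1.5            # illustrative degrees for the input and the system

def output(t, n=200_000):
    """Midpoint-rule approximation of y(t) = int_0^t tau^H (t - tau)^(lam - 1) dtau."""
    tau = (np.arange(n) + 0.5) * (t / n)
    return np.sum(tau ** H * (t - tau) ** (lam - 1.0)) * (t / n)

# Closed form for this particular pair: y(t) = B(H + 1, lam) * t^(H + lam).
beta = math.gamma(H + 1.0) * math.gamma(lam) / math.gamma(H + 1.0 + lam)

t0, a = 1.3, 2.5
y_t, y_at = output(t0), output(a * t0)

# Homogeneity of degree H + lam, as in (7.20): y(t) = a^{-(H + lam)} y(a t).
print(y_t, a ** (-(H + lam)) * y_at, beta * t0 ** (H + lam))
```

All three printed numbers should agree closely; the first two agree essentially to
rounding error because the quadrature grid scales with t.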

Note that we can derive analogous results for deterministically time-invariant
inputs. However, this time the results are somewhat degenerate. In particular, except
in trivial cases, for a time-invariant (i.e., constant) input, the output of such a system
is only well-defined if the system is an identity system since any other LTSI(λ) system
is unstable. Nevertheless, in this unique case, the output is, obviously, time-invariant
as well.

Consider, next, the case of an input that is either wide-sense or strict-sense
statistically scale-invariant with parameter H, as defined in Chapter 3. In this case, it
is also straightforward to show that the output, when well-defined, is also statistically
scale-invariant and satisfies

$$y(t) \overset{\mathcal{P}}{=} a^{-(H+\lambda)}\, y(at)$$

with equality in the corresponding sense.

For  wide-  or  strict-sense  stationary  (i.e.,  statistically  time-invariant)  inputs,  the 
outputs,  when  well-defined,  are  also  stationary.  This  is,  of  course,  a  well-known  result 
from  LTI  system  theory.  Note,  however,  that,  again  from  stability  considerations,  the 
only  non-trivial  system  for  which  the  output  is  well-defined  is  the  identity  system. 
This  implies,  for  instance,  that,  in  general,  when  driven  with  stationary  white  noise, 
the  outputs  of  such  systems  are  not  well-defined. 

Many of these issues surfaced in Chapter 3, where we considered the modeling
of 1/f processes through a synthesis filter formulation. Specifically, the system with
impulse response (3.9) we first proposed as a synthesis filter for 1/f processes is
precisely an example of an LTSI(λ) system with λ = H + 1/2. A similar filter with
λ = H - 1/2 appears in the conceptual synthesis for fractional Brownian motion illustrated
in Fig. 3-2. Furthermore, the fractional integrator used in the Barnes-Allan synthesis
for 1/f processes, which we described in Section 3.2, has properties similar to those of
an LTSI(λ) system. More generally, there would appear to be a number of potentially
important connections between the operators of fractional calculus [41] and linear
jointly time- and scale-invariant system theory. Finally, the ARMA filter used by
Keshner in his synthesis of 1/f-like behavior, which we discussed in Section 3.2.1,
can be viewed as an approximation to an LTSI(λ) system. Specifically, this filter
is linear and time-invariant, but satisfies the scale-invariance relation (7.16) only for
dilation factors τ of the form τ = Δ^m.

The impulse response constitutes one important means of characterizing LTSI(λ)
systems. However, from an implementational perspective, it is not always most
convenient. In the next section, we develop a canonical representation for a class of LTSI(λ)
systems in terms of wavelet bases. As we shall see, this characterization not only
provides additional insight into such systems, but ultimately leads to some important
techniques for realizing and approximating such systems. In fact, we will see that
it is possible to interpret many of the wavelet-based representations for self-similar
signals we derived in Chapters 3 and 5 in the context of these results.


7.4 Wavelet-Based LTSI(λ) Systems

Consider a system which computes the wavelet transform of the input x(t) via (2.1),
multiplies the resulting field by some regular field K_ν^μ in time-scale space, then
inverts the result according to the synthesis formula (2.4), so that

$$y(t) = \mathcal{W}^{-1}\left\{ K_\nu^\mu\, \mathcal{W}\{x(t)\} \right\}. \tag{7.21}$$

It is straightforward to establish that such a linear system has a kernel

$$\kappa(t,\tau) = \frac{1}{C_\psi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \psi_\nu^\mu(t)\, K_\nu^\mu\, \psi_\nu^\mu(\tau)\, \mu^{-2}\, d\mu\, d\nu. \tag{7.22}$$

The structure of this kernel imposes certain constraints on the linear system; for
example, such systems are symmetric, i.e.,

$$\kappa(t,\tau) = \kappa(\tau,t). \tag{7.23}$$

However, the structure of the kernel is sufficiently general that one can implement
LTI, LSI, or LTSI systems using this framework.

For instance, using the readily derived identity

$$\psi_\nu^\mu(t - b) = \psi_{\nu+b}^\mu(t)$$

valid for all b, the system will be time-invariant (i.e., satisfy (7.2)) whenever the
multiplier field satisfies

$$K_\nu^\mu = K_{\nu-b}^\mu$$

for all b. In other words, (7.21) implements an LTI system whenever the field K_ν^μ is
independent of ν. In this case, K_ν^μ can be expressed as

$$K_\nu^\mu = k(\mu) \tag{7.24}$$

for some regular function of scale k(·).

Likewise, using the identity

$$\psi_\nu^\mu(at) = |a|^{-1/2}\, \psi_{\nu/a}^{\mu/a}(t) \tag{7.25}$$

valid for all a ≠ 0, the system will be scale-invariant with parameter λ (i.e., satisfy (7.16))
whenever the multiplier field satisfies

$$K_\nu^\mu = a^{-\lambda}\, K_{a\nu}^{a\mu} \tag{7.26}$$

for all a > 0.

For the system to be jointly time- and scale-invariant with parameter λ, (7.24)
and (7.26) require that

$$k(\mu) = a^{-\lambda}\, k(a\mu),$$

i.e., that k(·) be homogeneous of degree λ. The imposition of regularity on k(·)
precludes it from containing impulses or derivatives of impulses. Again using Gel'fand's
parameterization of the homogeneous functions, we conclude that the system (7.21)
is LTSI(λ) whenever the multiplier field has the form

$$K_\nu^\mu = k(\mu) = C_1 |\mu|^{\lambda} u(\mu) + C_2 |\mu|^{\lambda} u(-\mu) \tag{7.27}$$

for some constants C_1 and C_2. Note that even if these constants are chosen so that
k(·) is asymmetric, the impulse response v(t) of the resulting system will be even:

$$v(t) = v(-t). \tag{7.28}$$

This is a consequence of the symmetry constraint (7.23). In fact, since we can rewrite
(7.22) using (7.25) as

$$\kappa(t,\tau) = \frac{1}{C_\psi} \int_{-\infty}^{\infty} d\nu \int_{0}^{\infty} \psi_\nu^\mu(t)\, \left[ k(\mu) + k(-\mu) \right] \psi_\nu^\mu(\tau)\, \mu^{-2}\, d\mu$$

we see that the kernel of the system is really only a function of the even part of the
function k(·). Hence, without loss of generality we may set C_2 = 0 in (7.27) and
choose

$$K_\nu^\mu = k(\mu) = C \mu^{\lambda} u(\mu) \tag{7.29}$$

where C is an arbitrary constant.

Finally, combining (7.28) with (7.19), we can conclude that whenever (7.21)
implements an LTSI(λ) system, i.e., whenever k(·) is chosen according to (7.29), the
impulse response of (7.21) must take the form

$$v(t) = C_7 |t|^{\lambda-1}$$

whenever λ ≠ 0, -2, -4, ..., or the form

$$v(t) = C_7 |t|^{-(n+1)} + C_8\, \delta^{(n)}(t)$$

whenever λ = -n for n = 0, 2, 4, .... In both cases, C_7 and C_8 are parameters
determined by the constant C in (7.29). Note, in particular, that, at least for the
case λ = 0 we must have

$$C_8 = C$$
$$C_7 = 0.$$

This follows from the fact that, since K_ν^μ = C, the overall system (7.21) is just a
scaled identity system. Fig. 7-1 summarizes the resulting wavelet-based realization
of a linear jointly time- and scale-invariant system with parameter λ. Note that
this is analogous to implementing an LTI system by computing the Fourier
transform of the input, multiplying by some frequency response, and applying the inverse
Fourier transform to the result. As is the case for Fourier-based implementations of
LTI systems, not all LTSI(λ) systems may be realized using the wavelet-based
implementation of Fig. 7-1. For example, the symmetry constraint (7.28) precludes us
from being able to implement either the differentiator or integrator system examples
discussed in Section 7.3 since these systems have impulse responses that are not even.

Figure 7-1: Wavelet-based implementation of an LTSI(λ) system. [Block diagram: the
wavelet transform of the input x(t) is multiplied by Cμ^λ u(μ) and inverse
transformed to produce the output y(t).]

As a final remark, it is important to emphasize that the actual choice of wavelet
basis plays no significant role in the representation of LTSI(λ) systems discussed in
this section. However, while the choice of basis does not enter into the theoretical
development, it is reasonable to expect it to be a factor in any practical implementation.
In the next section, we consider a strategy for approximating LTSI(λ) systems that
exploits orthonormal wavelet bases. As we shall see, these quasi-LTSI(λ) systems are
particularly convenient to implement and can be made computationally efficient.


7.4.1 Dyadic Approximations to LTSI(λ) Systems Based on
Orthonormal Wavelet Bases

A practical approximation to a linear time-scale invariant system can be constructed
via orthonormal wavelet bases of the type described in Section 2.2. Because signal
reconstructions in terms of such bases require only a countable collection of samples of
the wavelet coefficient field, the system turns out to be fundamentally more practical
from an implementational perspective. In addition, using an implementation based
on the DWT, the system can be made computationally highly efficient as well. In fact,
in some sense, using the discrete wavelet transform to implement an LTSI(λ) system
is analogous to implementing an LTI system using the discrete Fourier transform
(DFT).

Consider a system which computes the orthonormal wavelet decomposition of the
input x(t) according to the analysis formula (2.5b), i.e.,

$$x_n^m = \int_{-\infty}^{\infty} x(t)\, \psi_n^m(t)\, dt,$$

scales the resulting collection of wavelet coefficients by a factor k_n^m,

$$y_n^m = k_n^m\, x_n^m,$$

then re-synthesizes a signal from these modified coefficients to generate an output
according to the synthesis formula (2.5a), i.e.,

$$y(t) = \sum_m \sum_n y_n^m\, \psi_n^m(t).$$

It is a straightforward exercise to show that the overall system, described via

$$y(t) = \mathcal{W}_d^{-1}\left\{ k_n^m\, \mathcal{W}_d\{x(t)\} \right\}, \tag{7.30}$$

corresponds to a symmetric linear system with kernel

$$\kappa(t,\tau) = \sum_m \sum_n \psi_n^m(t)\, k_n^m\, \psi_n^m(\tau).$$

A close inspection reveals that, as a consequence of the nature of the discretization
inherent in the system, one cannot choose the multiplier coefficients k_n^m such that the
resulting system is time-invariant. Likewise, one cannot choose the coefficients so
that the overall system is scale-invariant for any degree λ. However, we can show
that if the k_n^m are chosen in a manner consistent with the discussion of the previous
section, viz.,

$$k_n^m = C\, 2^{-\lambda m},$$

then the system defined via (7.30) obeys some associated notions of time- and
scale-invariance.
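A discrete-time version of this analyze, scale, and resynthesize structure can be
sketched with the Haar basis. This is a minimal illustration under several
assumptions that do not come from the text: the input is a finite sequence whose
length is a power of two, the Haar wavelet is used, the finest detail level is
labeled m = -1 (coarser levels m = -2, -3, ...), and the coarsest approximation
coefficient is passed through unscaled. The function names are hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_analysis(x):
    """Orthonormal Haar DWT; returns the coarsest approximation and details, finest first."""
    approx, details = np.asarray(x, dtype=float), []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / SQRT2)
        approx = (even + odd) / SQRT2
    return approx, details

def haar_synthesis(approx, details):
    """Inverse of haar_analysis (coarse-to-fine reconstruction)."""
    approx = np.asarray(approx, dtype=float)
    for d in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / SQRT2
        out[1::2] = (approx - d) / SQRT2
        approx = out
    return approx

def quasi_ltsi(x, lam, c=1.0):
    """Scale the detail coefficients at scale index m by c * 2^(-lam * m), then resynthesize."""
    approx, details = haar_analysis(x)
    # Assumed labeling: details[j] sits at scale index m = -(j + 1), so 2^(-lam*m) = 2^(lam*(j+1)).
    scaled = [c * 2.0 ** (lam * (j + 1)) * d for j, d in enumerate(details)]
    return haar_synthesis(approx, scaled)   # approximation term left unscaled (assumption)

x = np.sin(2 * np.pi * np.arange(16) / 16.0)
y = quasi_ltsi(x, lam=0.8)                  # coarse scales are emphasized for lam > 0
```

With `lam=0.0` and `c=1.0` the function reduces to the identity, and the matrix of the
map is symmetric for any `lam`, mirroring the symmetry of the kernel noted in the text.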

We begin by noting that this system, which is depicted in Fig. 7-2, has a kernel
satisfying

$$\kappa(t,\tau) = 2^{-(\lambda-1)m}\, \kappa(2^m t, 2^m \tau) \tag{7.32}$$

for any integer m, where we have used the identity

$$\psi_n^m(2^i t) = 2^{-i/2}\, \psi_n^{m+i}(t)$$

valid for any integer i. However, since (7.32) can be restated in terms of the
generalized scale invariance condition (7.17) according to

$$\kappa(t,\tau) = a^{-(\lambda-1)}\, \kappa(at, a\tau), \qquad a = 2^m$$

we see that the system obeys a weaker, dyadic scale invariance condition. In
particular, the system satisfies (7.16) only for dilation factors τ of the form

$$\tau = 2^m$$

for integers m.

Figure 7-2: A dyadic approximation of an LTSI(λ) system as implemented via an
orthonormal wavelet basis.

Likewise, the system obeys a somewhat weaker time-invariance property. Consider
a class of input signals x(t) to the system which have no detail at scales coarser than
2^M for some integer M, so that

$$x_n^m = 0, \qquad m < M.$$

Then, the multiplier coefficients k_n^m for m < M for the system are irrelevant and we
may arbitrarily assume them to be zero. For this class of inputs then, the effective
kernel is

$$\kappa_{\mathrm{eff}}(t,\tau) = \sum_{m \ge M} \sum_n \psi_n^m(t)\, C\, 2^{-\lambda m}\, \psi_n^m(\tau).$$

Using the identity

$$\psi_n^m(t - l\,2^{-M}) = \psi_{n + l\,2^{m-M}}^m(t)$$

valid whenever m ≥ M and l is an integer, we see that this kernel satisfies

$$\kappa_{\mathrm{eff}}(t,\tau) = \kappa_{\mathrm{eff}}(t - l\,2^{-M}, \tau - l\,2^{-M}) \tag{7.33}$$

for all integers l. Since (7.33) can be re-expressed as

$$\kappa_{\mathrm{eff}}(t,\tau) = \kappa_{\mathrm{eff}}(t - b, \tau - b), \qquad b = l\,2^{-M}$$

we see that for this class of input signals the system is periodically time-invariant,¹
i.e., satisfies (7.2) for any shift factor τ of the form

$$\tau = l\,2^{-M}, \qquad l = \ldots, -1, 0, 1, 2, \ldots.$$

Note that the actual choice of wavelet affects the characteristics of the overall
system, in contrast to the wavelet-based systems discussed in the previous section.
Indeed, with respect to scaling behavior, the choice of wavelet will affect how the
system behaves under non-dyadic scale changes at the input. Furthermore, the choice
of wavelet will affect the class of inputs for which our time-invariance relation is
applicable, as well as the behavior of the system under input translations that are
not multiples of 2^(-M).

In summary, this chapter has used a system theory perspective to suggest a
unifying framework for viewing various results of the thesis. For example, our
results on LTSI(λ) systems, their properties, and their wavelet-based representations
suggest that wavelet-based synthesis, analysis, and processing of self-similar signals
is rather natural. Indeed, the 1/f synthesis and whitening filters described in
Section 4.1 are specific examples of the quasi-LTSI(λ) systems developed in this section.
As discussed, these filters play an important role in addressing problems of optimal
detection and estimation involving 1/f processes. It is conceivable that the
transmitter and receiver structures for fractal modulation discussed in Chapter 6 can also
be interpreted in terms of such quasi-LTSI(λ) systems. Such an interpretation could
lead to some potentially useful additional insights into both homogeneous signals and
fractal modulation.

More generally, a system theory perspective provides some novel insights into the
relationships between Laplace, Fourier, Mellin and wavelet transformations, both as
signal analysis tools and as representations for characterizing linear systems. In
particular, we have shown results suggesting that while Laplace transforms are naturally
suited to the analysis of linear time-invariant systems, and while Mellin transforms are
naturally suited to the analysis of scale-invariant systems, it is the wavelet transform
that plays the corresponding role for linear systems that are simultaneously time- and
scale-invariant. However, this chapter represents a rather preliminary exploration of
self-similar systems and many topics and issues remain to be explored. Indeed, these
represent some interesting and potentially fruitful directions for further research.

¹Actually, the multirate systems literature typically refers to such systems as periodically
time-varying since they can be modeled as LTI systems whose parameters vary periodically in time.

Chapter 8


Conclusions and Future Directions


In this thesis we have developed robust, efficient, and convenient wavelet-based
representations for two distinct families of fractal signals, and explored some potentially
important applications.

For the 1/f family of fractal signals, we established that orthonormal wavelet
expansions generally play the role of Karhunen-Loeve-like expansions. That is, when
1/f processes are expanded in terms of orthonormal wavelet bases, the coefficients of
the expansion tend to be very weakly correlated. This conclusion was supported both
with theoretical evidence and with empirical evidence based upon real and simulated
data. Many of the results exploited a new frequency-based characterization for 1/f
processes introduced in the thesis.

Using this Karhunen-Loeve-type representation for 1/f processes we were able to
solve many fundamental but previously intractable problems of detection and
estimation involving such processes. For example, we were able to develop maximum
likelihood parameter estimation algorithms for 1/f signals embedded in white
background noise. We emphasize that while other parameter estimation algorithms for
1/f processes exist, our work is the first we are aware of to explicitly take into
account the presence of background noise. As we pointed out, such noise is important
in ensuring that the algorithms are robust with respect to both measurement noise
and modeling errors. The resulting parameter estimation algorithms, which compute
both fractal dimension parameters and the accompanying signal and noise variance
parameters, can be used for signal and texture classification or for signal detection in
a number of possible applications.

We were also able to develop what appear to be the first minimum mean-square
error signal estimation (smoothing) algorithms for 1/f processes corrupted by white
noise. There are many potential problems involving signal restoration and
enhancement to which the resulting wavelet-based Wiener filters may be applied.

As examples of other signal processing problems that can be addressed using our
new representation, we developed algorithms for coherent detection in the presence
of 1/f and white noise, and algorithms for discriminating between distinct
noise-corrupted 1/f signals. The detection algorithms are potentially important in a
number of communication and telemetry applications, while the discrimination algorithms
can be used for signal classification in a number of application contexts.

While a considerable amount of additional analysis and testing of these various
algorithms remains to be done, our preliminary theoretical and empirical studies
suggest that they are not only robust, but highly computationally efficient as well.
Indeed, the existence of high-speed discrete wavelet transform algorithms means that
wavelet representations are of both significant theoretical and practical importance.

In the second half of the thesis, we introduced and developed a novel family of
homogeneous signals that were defined in terms of a dyadic scale-invariance property.
We distinguished between two classes, energy-dominated and power-dominated, and
established that orthonormal wavelet basis representations are as important for these
signals as they are for 1/f signals. In particular, choosing a suitable wavelet-based
inner product, we showed that the energy-dominated homogeneous signals
constitute a Hilbert space for which we were able to construct wavelet-based orthonormal
"self-similar" bases. We discussed the spectral and fractal properties of homogeneous
signals, and developed efficient discrete-time algorithms for their synthesis and
analysis.

As one potential application direction, we considered the use of homogeneous
signal sets in a communications-based context. In particular, we developed a strategy
for embedding information sequences into homogeneous waveforms that we termed
fractal modulation. With fractal modulation, the transmitted waveforms have the
property that the information can be recovered given either arbitrarily little duration
or arbitrarily little bandwidth. More generally, we were able to show that the resulting
multirate modulation scheme is well-suited for use with noisy channels of
simultaneously uncertain duration and bandwidth. As we discussed, this model is useful not
only in describing a variety of physical channels, but also in capturing receiver
constraints inherent in many point-to-point and broadcast communication scenarios. We
developed both minimum mean-square error receivers for demodulating
continuous-amplitude information and minimum probability-of-error receivers for demodulating
discrete-amplitude information. In each case, we evaluated the performance of the
overall communication scheme and made comparisons to more traditional modulation
strategies.

While our development of fractal modulation considered many issues, including
finite message length effects and computational complexity, many other issues, such as
synchronization and buffering, remain to be adequately explored. As a consequence,
fractal modulation does not yet constitute a fully-developed communication protocol.
Nevertheless, based on the results of our preliminary investigation, it does represent a
novel, compelling and potentially important paradigm for communication in a variety
of military and commercial communication contexts.


8.1  Future  Directions 

In  general,  there  are  numerous  outstanding  issues  pertaining  to  the  topics  treated  in 
this  thesis.  Indeed,  many  of  these  issues  have  been  identified  within  the  appropriate 
chapters  of  the  thesis.  Still  others  have  yet  to  be  identified.  Such  issues  suggest 
important  future  directions  for  work  on  the  topic. 

In the case of Chapter 7, which discussed connections between self-similar signals
and self-similar systems, our treatment was particularly cursory. Much work remains
to be done in developing a unified framework from which to view the various results of
the thesis, and in understanding the nature of the relationship between the Laplace,
Fourier, Mellin, and wavelet transforms.

An example of a problem not considered within this thesis, but warranting
investigation, is that of optimal prediction of 1/f processes. Naturally, one can envision
numerous applications involving economic, geophysical and other time series where
optimal forecasting algorithms could be exploited. However, given the particular
correlation structure within 1/f processes, we might reasonably speculate that such
processes are fundamentally unpredictable, i.e., that accurate prediction is
impossible from observations corresponding to typical data lengths and SNRs. In this case,
a quantitative result on the predictability of 1/f processes would be an equally
important result. It is certainly plausible that at least indirectly our wavelet-based
representations for 1/f processes can be used to obtain both prediction algorithms
and performance bounds for this class of signals.

The development of efficient representations for non-Gaussian fractal signal
models represents another important direction for future research. In this thesis, we have
restricted our attention principally to Gaussian 1/f processes, largely for reasons of
tractability. And while Gaussian 1/f processes are valuable models for many real
signals, considerably richer behavior can be derived from non-Gaussian 1/f models
generated by nonlinear mechanisms. Consequently, techniques for modeling
non-Gaussian 1/f signals as well as algorithms for processing such signals would be
potentially useful in an apparently wide range of application contexts.

Finally, we remark that there would appear to be many other potential
applications for the classes of homogeneous signals we have introduced in this thesis. Even
within the context of communications, one can imagine a whole host of additional
uses of homogeneous signals beyond the particular fractal modulation application we
consider in this thesis. In many respects, identifying and exploring new and
potentially promising applications represents perhaps the most exciting direction for future
research suggested by this thesis.

Appendix A

Derivation of the Discrete Wavelet
Transform


A.1 Analysis Algorithm

The  key  to  developing  an  efficient  discrete-time  implementation  of  the  wavelet  de¬ 
composition  lies  in  recognizing  a  useful  recursion.  Because 

£  V.  c  K, 

there  exists  a  pair  of  sequences  h[n]  and  ^[n]  such  that  we  can  express  these  functions 
in  terms  of  a  basis  for  Vi,  i.e., 


4>oi^)  =  (A-la) 

i 

i’oit)  =  (A.lb) 

i 

where  the  coefficient  h[n]  and  g[n]  are  given  by  the  appropriate  projections,  viz., 
(2.20).  Equivalently,  we  may  express  (A.l)  in  the  frequency  domain  as 


$(a;)  =  2-^/2  ^(w/2)$(u)/2)  (A.2a) 

^(a;)  =  2-^I‘^G{uI2)^ujI2).  (A.2b) 
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In  any  case,  multiplying  both  sides  of  (A.l)  by  2"*^^,  replacing  t  with  2’"t  —  n,  and 
effecting  a  change  of  variables  we  get,  more  generally, 


CW  =  Y,h[l-2n]4,T*'(t)  (A.3a) 

I 

C(l)  =  Y,sV-2nl4'r*\t)  (A.3b) 

/ 

where,  in  turn,  we  may  rewrite  (2.20)  as 

/OO 

(A.4a) 

■OO 

/OO 

'(«)*•  (A.4b) 

'OO 

The discrete-time algorithm for the fine-to-coarse decomposition associated with
the analysis follows readily. Specifically, substituting (A.3a) into (2.13) and (A.3b)
into (2.5b), we get, for each $m$, the filter-downsample relations (2.21a) and (2.21b)
defining the algorithm.
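
As a concrete illustration of the filter-downsample step, the following minimal Python sketch applies the recursion suggested by (A.3) and (A.4) to a finite coefficient sequence. Relations (2.21a) and (2.21b) are not reproduced in this appendix, so the form used here, a^m[n] = sum_l h[l - 2n] a^{m+1}[l] and x^m[n] = sum_l g[l - 2n] a^{m+1}[l], is an assumption read off from (A.3). The Haar filter pair, the function name `analysis_step`, and the zero-padding at the end of the sequence are likewise illustrative choices, not part of the derivation above.

```python
import math

# Haar filter pair, chosen only because it is the shortest orthonormal example.
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # h[0], h[1]
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]  # g[0], g[1]

def analysis_step(fine, h=H, g=G):
    """One fine-to-coarse step: weight by h[l - 2n] or g[l - 2n] and keep every second output.

    fine holds the scale-(m+1) approximation coefficients; the return value is
    (approx, detail), the scale-m approximation and wavelet coefficients.
    Samples beyond the end of the sequence are treated as zero.
    """
    approx, detail = [], []
    for n in range(len(fine) // 2):
        a = d = 0.0
        for k in range(len(h)):          # k plays the role of l - 2n
            l = 2 * n + k
            if l < len(fine):
                a += h[k] * fine[l]
                d += g[k] * fine[l]
        approx.append(a)
        detail.append(d)
    return approx, detail

approx, detail = analysis_step([4.0, 2.0, 5.0, 5.0])
```

Because the Haar pair is orthonormal, the sum of squares of `approx` and `detail` together equals the sum of squares of the input, which is a convenient sanity check on any implementation of this step.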

A.2 Synthesis Algorithm

The coarse-to-fine refinement algorithm associated with the synthesis can be derived
in a complementary manner. Since

$$ \phi_n^{m+1}(t) \in V_{m+1} = V_m \oplus O_m, $$

we can write

$$ \phi_n^{m+1}(t) = A_m\{\phi_n^{m+1}(t)\} + D_m\{\phi_n^{m+1}(t)\}
   = \sum_l \left\{ h[n-2l]\, \phi_l^m(t) + g[n-2l]\, \psi_l^m(t) \right\} \tag{A.5} $$

where the last equality follows by recognizing the projections in the respective
expansions as (A.4). The upsample-filter-merge relation (2.21c) then follows immediately.
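
The coarse-to-fine step can be sketched in the same spirit. Relation (2.21c) is not reproduced here, so the update used below, a^{m+1}[n] = sum_l { h[n - 2l] a^m[l] + g[n - 2l] x^m[l] }, is an assumption read off from (A.5). The Haar filters and the name `synthesis_step` are again illustrative.

```python
import math

H = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # Haar h[n] (illustrative choice)
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]  # Haar g[n]

def synthesis_step(approx, detail, h=H, g=G):
    """One coarse-to-fine step: fine[n] = sum_l h[n - 2l] approx[l] + g[n - 2l] detail[l]."""
    fine = [0.0] * (2 * len(approx))
    for l in range(len(approx)):
        for k in range(len(h)):          # k plays the role of n - 2l
            n = 2 * l + k
            if n < len(fine):
                fine[n] += h[k] * approx[l] + g[k] * detail[l]
    return fine

# Coarse coefficients of the fine sequence [4, 2, 5, 5] under the same Haar pair.
approx = [6 / math.sqrt(2), 10 / math.sqrt(2)]
detail = [2 / math.sqrt(2), 0.0]
fine = synthesis_step(approx, detail)
```

With an orthonormal filter pair this step inverts the filter-downsample step, so `fine` reproduces the sequence from which `approx` and `detail` were computed.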

Appendix B

Proofs  for  Chapter  3 


B.1 Proof of Theorem 3.2


We first establish the following useful lemma.

Lemma B.1

When a 1/f process $x(t)$ is passed through a filter with frequency response

$$ B_a(\omega) = \begin{cases} 1 & \pi a < |\omega| \le 2\pi a \\ 0 & \text{otherwise} \end{cases} \tag{B.1} $$

for any $a > 0$, the output $y_a(t)$ is wide-sense stationary, has finite variance and has
an autocorrelation satisfying

$$ R_{y_a}(\tau) = E[y_a(t)\, y_a(t-\tau)] = a^{-2H} R_{y_1}(a\tau) \tag{B.2} $$

for all $a > 0$. Furthermore, for any distinct integers $m$ and $k$, the processes $y_{2^m}(t)$
and $y_{2^k}(t)$ are jointly wide-sense stationary.

Proof:

First, from Definition 3.1 we have immediately that $y_1(t)$ is wide-sense stationary. More
generally, consider the case $a > 0$. Let $b_a(t)$ be the impulse response of the filter with
frequency response (B.1). To establish (B.2), it suffices to note that $y_a(t)$ has correlation
function

$$ \begin{aligned}
R_{y_a}(t,s) &= E[y_a(t)\, y_a(s)] \\
 &= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} b_a(t-\alpha)\, b_a(s-\beta)\, R_x(\alpha,\beta)\, d\alpha\, d\beta \\
 &= a^{-2H} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} b_1(at-\alpha)\, b_1(as-\beta)\, R_x(\alpha,\beta)\, d\alpha\, d\beta \\
 &= a^{-2H} R_{y_1}(at,as)
\end{aligned} \tag{B.3} $$

where we have exploited the identities (3.2b) and

$$ b_a(t) = a\, b_1(at). $$

However, since $y_1(t)$ is wide-sense stationary, the right side of (B.3) is a function only
of $t - s$. Hence, $y_a(t)$ is wide-sense stationary and (B.2) follows. Furthermore, $y_a(t)$ has
variance

$$ R_{y_a}(0,0) = a^{-2H} R_{y_1}(0,0) < \infty $$

where the inequality is a consequence of Definition 3.1. To establish our final result, since
$B_{2^m}(\omega)$ and $B_{2^k}(\omega)$ occupy disjoint frequency intervals for $m \ne k$, the spectra of $y_{2^m}(t)$ and
$y_{2^k}(t)$ likewise occupy disjoint frequency intervals. Thus, $y_{2^m}(t)$ and $y_{2^k}(t)$ are uncorrelated,
and, hence, jointly wide-sense stationary as well ■
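
The scaling identity b_a(t) = a b_1(at) used in the proof can be checked numerically. The sketch below assumes the closed form of the ideal bandpass impulse response obtained by inverting (B.1), namely b_a(t) = [sin(2 pi a t) - sin(pi a t)] / (pi t); this closed form and the function name `b` are not taken from the thesis.

```python
import math

def b(a, t):
    """Impulse response of the ideal bandpass filter (B.1): unit gain on pi a < |w| <= 2 pi a."""
    if t == 0.0:
        return a                      # limit of the expression below as t -> 0
    return (math.sin(2.0 * math.pi * a * t) - math.sin(math.pi * a * t)) / (math.pi * t)

a, t = 1.7, 0.37                      # arbitrary test point
lhs = b(a, t)                         # b_a(t)
rhs = a * b(1.0, a * t)               # a * b_1(a t)
```

The two values agree to rounding error for any a > 0 and any t, which is the time-domain counterpart of the statement that B_a(w) is a frequency-scaled copy of B_1(w).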

We now proceed to a proof of our main theorem.

First, let us establish that $y(t)$ is wide-sense stationary. Let $M_L$ and $M_U$ be any
pair of integers such that

$$ 2^{M_L}\pi \le \omega_L < \omega_U \le 2^{M_U+1}\pi $$

and consider preceding the filter (3.25) with a filter whose frequency response is

$$ \tilde B(\omega) = \begin{cases} 1 & 2^{M_L}\pi < |\omega| \le 2^{M_U+1}\pi \\ 0 & \text{otherwise} \end{cases} \tag{B.4} $$

since this will not affect the output $y(t)$.

Let $\tilde y(t)$ be the output of the filter (B.4) when driven by $x(t)$. Then since

$$ \tilde B(\omega) = \sum_{m=M_L}^{M_U} B_{2^m}(\omega) $$

where $B_{2^m}(\omega)$ is as defined in (B.1) of Lemma B.1, we can decompose $\tilde y(t)$ according
to

$$ \tilde y(t) = \sum_{m=M_L}^{M_U} y_{2^m}(t) \tag{B.5} $$

where $y_{2^m}(t)$ is the response of the filter with frequency response $B_{2^m}(\omega)$ to $x(t)$.
From Lemma B.1 we have that the

$$ \{ y_{2^m}(t) \}, \quad M_L \le m \le M_U $$

are jointly wide-sense stationary. Hence, from (B.5) we see that $\tilde y(t)$ is wide-sense
stationary. Then since $y(t)$ is obtained from $\tilde y(t)$ through the filter (3.25), the
stationarity of $y(t)$ is an immediate consequence of the stationarity of $\tilde y(t)$ [65].

Let us now derive the form of the spectrum of $y(t)$, i.e., (3.26). We begin by
rewriting (B.2) of Lemma B.1 in the frequency domain as

$$ S_{y_a}(a\omega) = a^{-(2H+1)} S_{y_1}(\omega) \tag{B.6} $$

where $S_{y_a}(\omega)$ is the power spectrum associated with $y_a(t)$. For $1 < a < 2$, we observe
that $S_{y_1}(\omega)$ and $S_{y_a}(\omega)$ have spectral overlap in the frequency range $a\pi < |\omega| \le 2\pi$,
and can therefore conclude that the two spectra must be identical in this range. The
reasoning is as follows. If we pass either $y_a(t)$ or $y_1(t)$ through the bandpass filter
with frequency response

$$ B^\dagger(\omega) = \begin{cases} 1 & a\pi < |\omega| \le 2\pi \\ 0 & \text{otherwise} \end{cases} $$

whose impulse response is $b^\dagger(t)$, the outputs must be identical, i.e.,

$$ b^\dagger(t) * y_a(t) = b^\dagger(t) * y_1(t) = b^\dagger(t) * x(t). $$

Since $y_a(t)$ and $y_1(t)$ are jointly wide-sense stationary, we then conclude

$$ S_{y_a}(\omega)\, |B^\dagger(\omega)|^2 = S_{y_1}(\omega)\, |B^\dagger(\omega)|^2 $$

whence

$$ S_{y_a}(\omega) = S_{y_1}(\omega), \qquad a\pi < |\omega| \le 2\pi. \tag{B.7} $$

Combining (B.7) with (B.6) we get

$$ S_{y_1}(a\omega) = a^{-(2H+1)} S_{y_1}(\omega), \qquad a\pi < |a\omega| \le 2\pi \tag{B.8} $$

for any $1 < a < 2$. Differentiating (B.8) with respect to $a$ and letting $a \to 1^+$, we
find that

$$ \omega\, S'_{y_1}(\omega) = -(2H+1)\, S_{y_1}(\omega), \qquad \pi < \omega \le 2\pi, $$

and note that all positive, even, regular solutions to this equation are of the
form

$$ S_{y_1}(\omega) = \frac{\sigma_x^2}{|\omega|^\gamma}, \qquad \pi < |\omega| \le 2\pi \tag{B.9} $$

for some $\sigma_x^2 > 0$ and $\gamma$ given by (3.7). Using (B.9) with (B.6) we find, further, that

$$ S_{y_{2^m}}(\omega) = \begin{cases} \sigma_x^2/|\omega|^\gamma & 2^m\pi < |\omega| \le 2^{m+1}\pi \\ 0 & \text{otherwise.} \end{cases} $$

Via Lemma B.1, the $y_{2^m}(t)$ are uncorrelated, so we deduce that $\tilde y(t)$ has spectrum

$$ S_{\tilde y}(\omega) = \sum_{m=M_L}^{M_U} S_{y_{2^m}}(\omega)
   = \begin{cases} \sigma_x^2/|\omega|^\gamma & 2^{M_L}\pi < |\omega| \le 2^{M_U+1}\pi \\ 0 & \text{otherwise.} \end{cases} $$

Finally, since

$$ S_y(\omega) = |B(\omega)|^2\, S_{\tilde y}(\omega) $$

our desired result (3.26) follows ■


B.2  Proof  of  Theorem  3.3 

To show that a fractional Brownian motion $x(t)$, for $0 < H < 1$, is a 1/f process
according to Definition 3.1, it suffices to consider the effect on $x(t)$ of any LTI filter
with a regular finite-energy impulse response $b(t)$ and frequency response $B(\omega)$
satisfying $B(0) = 0$. In particular, since $x(t)$ has correlation given by (3.16), the output
of the filter

$$ y(t) = \int_{-\infty}^{\infty} b(t-\tau)\, x(\tau)\, d\tau \tag{B.10} $$

has autocorrelation

$$ R_y(t,s) = E[y(t)\, y(s)]
   = -\frac{\sigma_H^2}{2} \int_{-\infty}^{\infty} b(v)\, dv \int_{-\infty}^{\infty} |t - s + u - v|^{2H}\, b(u)\, du $$

as first shown by Flandrin [29]. Since $R_y(t,s)$ is a function only of $t - s$, the process
is stationary, and has spectrum

$$ S_y(\omega) = \frac{|B(\omega)|^2}{|\omega|^{2H+1}}. $$

When we restrict our attention to the case in which $B(\omega)$ is the ideal bandpass filter
(3.24), we see that $y(t)$ is not only stationary, but has finite variance. This establishes
that any fractional Brownian motion $x(t)$ satisfies the definition of a 1/f process.

That the generalized derivative, fractional Gaussian noise $x'(t)$, is also a 1/f
process follows almost immediately. Indeed, when $x'(t)$ is processed by the LTI filter
with impulse response $b(t)$ described above, the output is $y'(t)$, the derivative of
(B.10). Since $y(t)$ is stationary, so is $y'(t)$. Moreover, $y'(t)$ has spectrum

$$ S_{y'}(\omega) = \frac{|B(\omega)|^2}{|\omega|^{2H'+1}} $$

where $H'$ is as given by (3.20). Again, when $B(\omega)$ is given by (3.24), $y'(t)$ is not only
stationary, but has finite variance, which is our desired result ■

B.3  Proof  of  Theorem  3.4 

Without loss of generality, let us assume $\sigma^2 = 1$. Next, we define

$$ x_M(t) = \sum_{m=-M}^{M} \sum_n x_n^m\, \psi_n^m(t) \tag{B.11} $$

as a resolution-limited approximation to $x(t)$ in which information at resolutions
coarser than $2^{-M}$ and finer than $2^M$ is discarded, so

$$ x(t) = \lim_{M\to\infty} x_M(t) = \sum_m \sum_n x_n^m\, \psi_n^m(t). $$

Since for each $m$ the wavelet coefficient sequence $x_n^m$ is wide-sense stationary with
spectrum $2^{-\gamma m}$, the approximation $x_M(t)$ is cyclostationary [65] with period $2^M$, has
finite variance, and has the associated time-averaged spectrum

$$ S_M(\omega) = \sum_{m=-M}^{M} 2^{-\gamma m}\, |\Psi(2^{-m}\omega)|^2. \tag{B.12} $$

The limiting time-averaged spectrum

$$ S_x(\omega) = \lim_{M\to\infty} S_M(\omega) $$

gives the desired spectrum expression (3.36), and corresponds to the time-averaged
spectrum of $x(t)$ as measured at the output of a bandpass filter for each frequency $\omega$
in the passband. The desired octave-spaced ripple relation (3.38) for arbitrary integer
$k$ follows immediately from (3.36).
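
The octave-spaced structure of this spectrum is easy to see numerically. The sketch below evaluates a truncated version of (B.12) and checks the relation S_x(2w) = 2^(-gamma) S_x(w), which follows from the displayed sum by shifting the summation index; since (3.38) itself is not reproduced in this appendix, its exact form is an assumption. The Haar wavelet, with |Psi(w)|^2 = sin^4(w/4) / (w/4)^2, the value gamma = 1, and the truncation M = 60 are illustrative choices (for Haar the sum converges only for 0 < gamma < 2).

```python
import math

def haar_psi_sq(w):
    """|Psi(w)|^2 for the Haar wavelet: sin^4(w/4) / (w/4)^2, with the limit 0 at w = 0."""
    if w == 0.0:
        return 0.0
    x = w / 4.0
    return math.sin(x) ** 4 / (x * x)

def s_m(w, gamma, M=60):
    """Truncated time-averaged spectrum: sum over m = -M..M of 2^(-gamma m) |Psi(2^(-m) w)|^2."""
    return sum(2.0 ** (-gamma * m) * haar_psi_sq(2.0 ** (-m) * w)
               for m in range(-M, M + 1))

gamma = 1.0
w = 3.0
ratio = s_m(2.0 * w, gamma) / s_m(w, gamma)   # should be close to 2^(-gamma)
```

Within each octave the spectrum is not exactly proportional to 1/|w|^gamma, but the ratio across octaves is fixed, which is what the bounds in (3.37) capture.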

To establish (3.37), we begin by noting that, given $\omega$, we can choose $m_0$ and $\omega_0$
such that $\omega = 2^{m_0}\omega_0$ and $1 \le |\omega_0| < 2$. Hence, using (3.38) we see

$$ S_x(\omega) = 2^{-m_0\gamma}\, S_x(\omega_0) $$

from which it follows that

$$ \left[ \inf_{1 \le |\omega_0| < 2} S_x(\omega_0) \right] \frac{1}{|\omega|^\gamma}
   \le S_x(\omega) \le
   \left[ \sup_{1 \le |\omega_0| < 2} S_x(\omega_0) \right] \frac{2^\gamma}{|\omega|^\gamma}. $$

It suffices, therefore, to find upper and lower bounds for $S_x(\omega_0)$ on $1 \le |\omega_0| < 2$.
Since $\psi(t)$ is $R$th-order regular, $\Psi(\omega)$ decays at least as fast as $1/\omega^R$ as $\omega \to \infty$.
This, together with the fact that $\Psi(\omega)$ is bounded according to (2.8a), implies that

$$ |\Psi(\omega)| \le \frac{C}{1 + |\omega|^R} $$

for some $C \ge 1$. Using this with (2.14a) in (3.36) leads to the upper bound

$$ S_x(\omega_0) \le \sum_{m=0}^{\infty} 2^{-\gamma m} + \sum_{m=1}^{\infty} 2^{\gamma m}\, C^2\, 2^{-2Rm} < \infty. $$

To establish the lower bound it suffices to show $S_x(\omega) > 0$ for every $1 \le \omega \le 2$,
which we establish by contradiction.

Suppose for some $1 \le \omega_0 \le 2$,

$$ S_x(\omega_0) = \sum_m 2^{-\gamma m}\, |\Psi(2^{-m}\omega_0)|^2 = 0. $$

Then since all the terms in the sum are non-negative, this would imply that each
term is zero, from which we could conclude

$$ \sum_m |\Psi(2^{-m}\omega_0)|^2 = 0. $$

However, this contradicts the wavelet basis identity (2.9). Hence, we must have that
$S_x(\omega_0) > 0$ for every $1 \le \omega_0 \le 2$. The complete theorem follows ■

B.4  Proof  of  Proposition  3.5 

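We begin by defining the process $x_K(t)$ as the result of filtering $x(t)$ with the ideal
bandpass filter whose frequency response is given by

$$ B_K(\omega) = \begin{cases} 1 & 2^{-K} < |\omega| \le 2^K \\ 0 & \text{otherwise} \end{cases} $$

so that

$$ \lim_{K\to\infty} x_K(t) = x(t). $$

Then by Theorem 3.2, $x_K(t)$ is wide-sense stationary and has power spectrum

$$ S_K(\omega) = \begin{cases} \sigma_x^2/|\omega|^\gamma & 2^{-K} < |\omega| \le 2^K \\ 0 & \text{otherwise.} \end{cases} $$

If we denote its corresponding autocorrelation by

$$ R_K(\tau) = E[x_K(t)\, x_K(t-\tau)] $$

and its wavelet coefficients by

$$ x_n^m(K) = \int_{-\infty}^{\infty} x_K(t)\, \psi_n^m(t)\, dt, $$

the correlation between wavelet coefficients may be expressed as

$$ \begin{aligned}
E[x_n^m(K)\, x_{n'}^{m'}(K)]
 &= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \psi_n^m(t)\, R_K(t-\tau)\, \psi_{n'}^{m'}(\tau)\, dt\, d\tau \\
 &= \int_{-\infty}^{\infty} \psi_n^m(t)\, \left[ R_K(t) * \psi_{n'}^{m'}(t) \right] dt.
\end{aligned} \tag{B.13} $$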

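Applying Parseval's theorem and exploiting (B.4), we may rewrite (B.13) in the
frequency domain as

$$ E[x_n^m(K)\, x_{n'}^{m'}(K)] = \frac{2^{-(m+m')/2}}{2\pi} \left\{
   \int_{-2^K}^{-2^{-K}} \frac{\sigma_x^2}{|\omega|^\gamma}\, \Psi(2^{-m}\omega)\, \Psi^*(2^{-m'}\omega)\, e^{-j(n2^{-m} - n'2^{-m'})\omega}\, d\omega
   + \int_{2^{-K}}^{2^K} \frac{\sigma_x^2}{|\omega|^\gamma}\, \Psi(2^{-m}\omega)\, \Psi^*(2^{-m'}\omega)\, e^{-j(n2^{-m} - n'2^{-m'})\omega}\, d\omega
   \right\}. \tag{B.14} $$

Interchanging limits, we get

$$ x_n^m = \lim_{K\to\infty} x_n^m(K) $$

and, in turn,

$$ E[x_n^m\, x_{n'}^{m'}] = \lim_{K\to\infty} E[x_n^m(K)\, x_{n'}^{m'}(K)]. \tag{B.15} $$

Finally, substituting (B.14) into (B.15) yields (3.40) as desired ■

B.5 Proof of Theorem 3.6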


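Let us define

$$ \Delta = 2^{-m} n - 2^{-m'} n' $$

and

$$ \Xi(\omega) = \omega^{-\gamma}\, \Psi(2^{-m}\omega)\, \Psi^*(2^{-m'}\omega) $$

for $\omega > 0$, so that (3.41) may be expressed, via (3.40), as

$$ \rho_{n,n'}^{m,m'} = \frac{\sigma_x^2}{\pi\sigma^2}\, 2^{(\gamma-1)(m+m')/2}\, \mathrm{Re}\{ I(\Delta) \} \tag{B.16} $$

where

$$ I(\Delta) = \int_0^\infty \Xi(\omega)\, e^{-j\Delta\omega}\, d\omega. \tag{B.17} $$

Thus, to establish the desired result, it suffices to show that $I(\Delta)$ has the appropriate
decay.

We first note that if $\gamma \ge 2R + 1$, then we cannot even guarantee that $I(\Delta)$
converges for any $\Delta$. Indeed, since

$$ \Xi(\omega) \sim O(\omega^{2R-\gamma}), \qquad \omega \to 0 $$

we see that the integrand of $I(\Delta)$ is not absolutely integrable. However, provided
$\gamma \le 2R$, the integrand is absolutely integrable, i.e.,

$$ \int_0^\infty |\Xi(\omega)|\, d\omega < \infty. $$

In this case, we have, by the Riemann-Lebesgue lemma [35], that

$$ I(\Delta) \to 0, \qquad \Delta \to \infty. $$
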
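When $0 < \gamma < 2R$, we may integrate (B.17) by parts $Q$ times, for some positive
integer $Q$, to obtain

$$ I(\Delta) = \frac{1}{(j\Delta)^Q} \int_0^\infty \Xi^{(Q)}(\omega)\, e^{-j\Delta\omega}\, d\omega
   + \sum_{q=0}^{Q-1} \frac{1}{(j\Delta)^{q+1}} \left\{ \lim_{\omega\to 0} \Xi^{(q)}(\omega)\, e^{-j\Delta\omega}
   - \lim_{\omega\to\infty} \Xi^{(q)}(\omega)\, e^{-j\Delta\omega} \right\}. \tag{B.18} $$

Due to the vanishing moments of the wavelet we have

$$ \Xi^{(q)}(\omega) \sim O(\omega^{2R-\gamma-q}), \qquad \omega \to 0 \tag{B.19} $$

while due to the regularity of the wavelet, $\Psi(\omega)$ decays at least as fast as $1/\omega^R$ as
$\omega \to \infty$, whence

$$ \Xi^{(q)}(\omega) \sim O(\omega^{-2R-\gamma-q}), \qquad \omega \to \infty. \tag{B.20} $$

Hence, the limit terms in (B.18) for which $-2R - \gamma < q < 2R - \gamma$ all vanish.

Moreover, when we substitute $q = Q$, (B.19) and (B.20) imply that $\Xi^{(Q)}(\omega)$ is
absolutely integrable, i.e.,

$$ \int_0^\infty |\Xi^{(Q)}(\omega)|\, d\omega < \infty, \tag{B.21} $$

whenever $-2R - \gamma + 1 < Q < 2R - \gamma + 1$, which implies, again via the Riemann-Lebesgue
lemma, that the integral in (B.18) vanishes asymptotically, i.e.,

$$ \int_0^\infty \Xi^{(Q)}(\omega)\, e^{-j\Delta\omega}\, d\omega \to 0, \qquad \Delta \to \infty. \tag{B.22} $$

Hence, choosing $Q = \lceil 2R - \gamma \rceil$ in (B.18) (so $2R - \gamma \le Q < 2R - \gamma + 1$) allows us to
conclude

$$ I(\Delta) \sim O\!\left( \Delta^{-\lceil 2R-\gamma \rceil} \right), \qquad \Delta \to \infty. \tag{B.23} $$

Substituting (B.23) into (B.16) then yields the desired result ■

Appendix C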

The EM Parameter Estimation Algorithm


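In this appendix, we derive the EM algorithm for the estimation of the signal and
noise parameters $\Theta = \{\beta, \sigma^2, \sigma_w^2\}$ for the scenario described in Section 4.2.

We begin by defining our observed (incomplete) data to be

$$ \mathbf{r} = \{ r_n^m,\ m,n \in \mathcal{R} \}, $$

and our complete data to be $(\mathbf{x}, \mathbf{r})$ where

$$ \mathbf{x} = \{ x_n^m,\ m,n \in \mathcal{R} \}. $$

Consequently, the EM algorithm for the problem is defined as [63]

E step: Compute

$$ U(\Theta, \hat\Theta^{[l]}) $$

M step:

$$ \max_{\Theta}\ U(\Theta, \hat\Theta^{[l]}) \rightarrow \hat\Theta^{[l+1]} $$

where

$$ U(\Theta, \hat\Theta) = E\left[ \ln p_{\mathbf{r},\mathbf{x}}(\mathbf{r},\mathbf{x};\Theta) \mid \mathbf{r}; \hat\Theta \right]. $$

For our case, $U$ is obtained conveniently via

$$ U(\Theta, \hat\Theta) = E\left[ \ln p_{\mathbf{r}|\mathbf{x}}(\mathbf{r}|\mathbf{x};\Theta) + \ln p_{\mathbf{x}}(\mathbf{x};\Theta) \mid \mathbf{r}; \hat\Theta \right] $$

with

$$ p_{\mathbf{r}|\mathbf{x}}(\mathbf{r}|\mathbf{x};\Theta) = \prod_{m,n\in\mathcal{R}} \frac{1}{\sqrt{2\pi\sigma_w^2}} \exp\left[ -\frac{(r_n^m - x_n^m)^2}{2\sigma_w^2} \right] $$

and

$$ p_{\mathbf{x}}(\mathbf{x};\Theta) = \prod_{m,n\in\mathcal{R}} \frac{1}{\sqrt{2\pi\sigma^2\beta^{-m}}} \exp\left[ -\frac{(x_n^m)^2}{2\sigma^2\beta^{-m}} \right]. $$

Then

$$ U(\Theta, \hat\Theta) = -\frac{1}{2} \sum_{m\in\mathcal{M}} N(m) \left\{ \frac{1}{\sigma_w^2} S_m^w(\hat\Theta) + \ln 2\pi\sigma_w^2
   + \frac{1}{\sigma^2\beta^{-m}} S_m^x(\hat\Theta) + \ln 2\pi\sigma^2\beta^{-m} \right\} \tag{C.1} $$

where

$$ S_m^w(\hat\Theta) = \frac{1}{N(m)} \sum_{n\in\mathcal{N}(m)} E\left[ (w_n^m)^2 \mid r_n^m; \hat\Theta \right] $$

$$ S_m^x(\hat\Theta) = \frac{1}{N(m)} \sum_{n\in\mathcal{N}(m)} E\left[ (x_n^m)^2 \mid r_n^m; \hat\Theta \right] $$

are (quasi) conditional sample-variance estimates from the data based upon the model
parameters $\hat\Theta$. Evaluating the expectations we get

$$ S_m^w(\hat\Theta) = A_m(\hat\Theta) + B_m^w(\hat\Theta)\, \hat\sigma_m^2 $$

$$ S_m^x(\hat\Theta) = A_m(\hat\Theta) + B_m^x(\hat\Theta)\, \hat\sigma_m^2 $$

where

$$ A_m(\hat\Theta) = \frac{\hat\sigma_w^2 \cdot \hat\sigma^2\hat\beta^{-m}}{\hat\sigma_w^2 + \hat\sigma^2\hat\beta^{-m}} $$

$$ B_m^w(\hat\Theta) = \left( \frac{\hat\sigma_w^2}{\hat\sigma_w^2 + \hat\sigma^2\hat\beta^{-m}} \right)^2 $$

$$ B_m^x(\hat\Theta) = \left( \frac{\hat\sigma^2\hat\beta^{-m}}{\hat\sigma_w^2 + \hat\sigma^2\hat\beta^{-m}} \right)^2 $$

which completes our derivation of the E step.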
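
The E step reduces to a few arithmetic operations per scale, as the following minimal Python sketch suggests. It assumes that the quantity multiplying B_m^w and B_m^x is the per-scale sample variance of the observed coefficients r_n^m, as defined in Section 4.2 and not reproduced here; the function name `e_step` and its argument names are hypothetical.

```python
def e_step(sample_var, m, beta, sigma2, sigma_w2):
    """Return (S_w, S_x) at scale m given the current estimates (beta, sigma2, sigma_w2).

    sample_var is assumed to be the sample variance of the observed coefficients
    r_n^m at scale m.
    """
    var_x = sigma2 * beta ** (-m)            # model variance of x_n^m
    total = sigma_w2 + var_x                 # model variance of r_n^m
    a_m = sigma_w2 * var_x / total           # conditional variance, shared by x and w
    b_w = (sigma_w2 / total) ** 2            # squared gain of the estimate of w from r
    b_x = (var_x / total) ** 2               # squared gain of the estimate of x from r
    return a_m + b_w * sample_var, a_m + b_x * sample_var

# When the data variance matches the model exactly, the E step returns the
# model variances themselves.
s_w, s_x = e_step(sample_var=1.25, m=2, beta=2.0, sigma2=3.0, sigma_w2=0.5)
```

This fixed-point behavior is a useful check: if the sample variance at a scale equals the model value, the conditional estimates coincide with the current noise and signal variances at that scale.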

To derive the structure of the M step, we maximize $U(\Theta,\hat\Theta)$ as given by (C.1).
This maximization is always well-defined as $U(\Theta,\hat\Theta) \le L(\Theta)$ for any $\Theta, \hat\Theta$.

The local extrema are obtained by differentiating $U(\Theta,\hat\Theta)$ with respect to each
of the parameters of $\Theta$. Since (C.1) expresses $U(\Theta,\hat\Theta)$ as the sum of two terms, one
of which depends only on $\sigma_w^2$ and the other of which depends only on $\beta$ and $\sigma^2$, the
maximization can be broken down into two independent parts.

Considering first our maximization over $\sigma_w^2$, we readily obtain the maximizing
$\hat\sigma_w^2$ as the sample-average

$$ \hat\sigma_w^2 = \frac{\sum_{m\in\mathcal{M}} N(m)\, S_m^w(\hat\Theta)}{\sum_{m\in\mathcal{M}} N(m)}. $$

Turning next to $\beta$ and $\sigma^2$, we find that the maximizing parameters $\hat\beta$ and $\hat\sigma^2$
satisfy

$$ \sum_{m\in\mathcal{M}} N(m)\, S_m^x(\hat\Theta)\, \hat\beta^m = \hat\sigma^2 \sum_{m\in\mathcal{M}} N(m) \tag{C.2a} $$

$$ \sum_{m\in\mathcal{M}} m\, N(m)\, S_m^x(\hat\Theta)\, \hat\beta^m = \hat\sigma^2 \sum_{m\in\mathcal{M}} m\, N(m). \tag{C.2b} $$

Eliminating $\hat\sigma^2$ we obtain that $\hat\beta$ is the solution of the polynomial equation

$$ \sum_{m\in\mathcal{M}} C_m\, N(m)\, S_m^x(\hat\Theta)\, \beta^m = 0, \tag{C.3} $$

where $C_m$ is as defined in (4.16). The eliminated variable $\hat\sigma^2$ is trivially obtained by
back-substitution:

$$ \hat\sigma^2 = \frac{\sum_{m\in\mathcal{M}} N(m)\, S_m^x(\hat\Theta)\, \hat\beta^m}{\sum_{m\in\mathcal{M}} N(m)}. $$

Finally, to show that the maximizing parameters are the only solution to (C.2) it
suffices to show that the solution to (C.3) is unique, which we establish via the
following lemma.

Lemma C.1 Any polynomial equation of the form

$$ \sum_{m\in\mathcal{M}} C_m\, K_m\, \beta^m = 0 \tag{C.4} $$

where $C_m$ is given by (4.16) and $K_m \ge 0$ has a unique positive real solution provided
$M \ge 2$ and not all $K_m$ are zero.
Proof:

Let

$$ m_* = \frac{\sum_{m\in\mathcal{M}} m\, N(m)}{\sum_{m\in\mathcal{M}} N(m)} $$

be a weighted average of the $m \in \mathcal{M}$, so $m_1 < m_* < m_M$. Then, from (4.16), for $m > m_*$,
$C_m > 0$, while for $m < m_*$, $C_m < 0$. Hence, $C_m (m - m_*) \ge 0$ with strict inequality for
at least two values of $m \in \mathcal{M}$ from our hypothesis. Now let $f(\beta)$ be the left-hand side of
(C.4), and observe that

$$ \tilde f(\beta) = f(\beta)\, \beta^{-m_*} $$

is increasing for $\beta > 0$, i.e.,

$$ \tilde f'(\beta) = \sum_{m\in\mathcal{M}} C_m\, (m - m_*)\, K_m\, \beta^{m - m_* - 1} > 0. $$

Then, since $\tilde f(0) = -\infty$ and $\tilde f(\infty) = \infty$, we see $\tilde f(\beta)$ has a single real root on $\beta > 0$. Since
$f(\beta)$ shares the same roots on $\beta > 0$, we have the desired result. ■

This completes our derivation for the M step. The complete algorithm follows
directly.
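
Because Lemma C.1 guarantees a single positive root, the M step can be carried out with a simple bracketing search. The sketch below is one possible realization under stated assumptions. Definition (4.16) of C_m is not reproduced in this appendix, so the code uses the weight (m - m_*) obtained by eliminating the signal variance between (C.2a) and (C.2b), which the proof above indicates has the same sign as C_m. The function name `m_step` and the use of bisection are illustrative choices, and the search assumes the hypotheses of the lemma hold (at least two scales with positive S_m^x on either side of m_*).

```python
import math

def m_step(scales, counts, s_w, s_x):
    """One M step: returns (beta, sigma2, sigma_w2) for given per-scale S_m^w and S_m^x.

    scales: the scale indices m; counts: N(m); s_w, s_x: E-step outputs per scale.
    """
    total = sum(counts)
    sigma_w2 = sum(n * sw for n, sw in zip(counts, s_w)) / total
    m_star = sum(m * n for m, n in zip(scales, counts)) / total

    def f_tilde(beta):
        # Left-hand side of (C.3), up to a positive factor, scaled by beta^(-m_star);
        # it is increasing in beta for beta > 0.
        return sum((m - m_star) * n * sx * beta ** (m - m_star)
                   for m, n, sx in zip(scales, counts, s_x))

    lo = hi = 1.0
    while f_tilde(lo) > 0.0:      # expand the bracket downward
        lo /= 2.0
    while f_tilde(hi) < 0.0:      # expand the bracket upward
        hi *= 2.0
    for _ in range(200):          # bisection on a geometric scale
        mid = math.sqrt(lo * hi)
        if f_tilde(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta = math.sqrt(lo * hi)
    # Back-substitution for the signal variance, as in the text.
    sigma2 = sum(n * sx * beta ** m for m, n, sx in zip(scales, counts, s_x)) / total
    return beta, sigma2, sigma_w2

# Synthetic check: S_m^x generated exactly from beta = 2.5 and sigma^2 = 1.7.
scales = [1, 2, 3, 4]
counts = [8, 4, 2, 1]
s_x = [1.7 * 2.5 ** (-m) for m in scales]
s_w = [0.3] * 4
beta_hat, sigma2_hat, sigma_w2_hat = m_step(scales, counts, s_w, s_x)
```

Alternating the E step and this M step until the parameter estimates stop changing gives the complete iteration; a polynomial root finder could replace the bisection, at the cost of having to select the positive real root.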

Appendix D

Proofs  for  Chapter  5 


D.1 Proof of Theorem 5.2


We begin by noting that we may insert any ideal bandpass filter with a passband
that includes $\omega_L < |\omega| < \omega_U$ before the filter specified by (5.4) without changing the
overall system. We can construct such a filter as the finite parallel combination of
filters with adjacent, octave-width passbands. In particular, let $\tilde x_m(t)$ be the output,
when driven by $x(t)$, of the filter with frequency response

$$ B_m(\omega) = \begin{cases} 1 & 2^m\pi < |\omega| \le 2^{m+1}\pi \\ 0 & \text{otherwise} \end{cases} \tag{D.1} $$

and corresponding impulse response $b_m(t)$. By definition of $x(t)$ we then know that
$\tilde x_0(t)$ has finite energy. Furthermore, we get

$$ \tilde x_m(t) = \int_{-\infty}^{\infty} x(\alpha)\, b_m(t - \alpha)\, d\alpha = 2^{-mH}\, \tilde x_0(2^m t) \tag{D.2} $$

where the last equality results from using a change of variables together with the
self-similarity relation (5.2) and the identity

$$ b_m(t) = 2^m\, b_0(2^m t). $$

Hence, every $\tilde x_m(t)$ has finite energy.

Next, choose finite integers $M_L$ and $M_U$ such that $2^{M_L}\pi \le \omega_L$ and $\omega_U \le 2^{M_U+1}\pi$.
Then

$$ \tilde b(t) = \sum_{m=M_L}^{M_U} b_m(t) $$

is the impulse response of an ideal bandpass filter with passband $2^{M_L}\pi < |\omega| \le 2^{M_U+1}\pi$.
When this filter is driven by $x(t)$ the output is

$$ z(t) = \sum_{m=M_L}^{M_U} \tilde x_m(t). $$

Because the $\tilde x_m(t)$ have Fourier transforms supported on disjoint frequency intervals,
they are orthogonal. Hence, $z(t)$ has energy

$$ \int_{-\infty}^{\infty} z^2(t)\, dt = \sum_{m=M_L}^{M_U} \int_{-\infty}^{\infty} \tilde x_m^2(t)\, dt $$

which is finite since each of the $\tilde x_m(t)$ has finite energy.

Finally, when we cascade the filter having impulse response $\tilde b(t)$ with the filter
having frequency response (5.4), the overall frequency response of the cascade is, of
course, (5.4) by our choice of $M_L$ and $M_U$. Hence, when the cascade is driven by $x(t)$,
the intermediate signal is $z(t)$ and the final output is necessarily $y(t)$ as defined in
the statement of the theorem. We can then conclude that $y(t)$ has finite energy since

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} |Y(\omega)|^2\, d\omega = \frac{1}{\pi} \int_{\omega_L}^{\omega_U} |Z(\omega)|^2\, d\omega
   \le \frac{1}{\pi} \int_0^\infty |Z(\omega)|^2\, d\omega < \infty. $$

To show the spectrum relation (5.5), it suffices to note that

$$ Y(\omega) = B(\omega) \sum_m \tilde X_m(\omega) = \begin{cases} X(\omega) & \omega_L < |\omega| < \omega_U \\ 0 & \text{otherwise} \end{cases} $$

where

$$ X(\omega) = \sum_m \tilde X_m(\omega). $$

Note that this sum is convergent because for each $\omega$ only one term in the sum is
non-zero. Finally, applying the self-similarity relation (D.2) gives

$$ X(\omega) = \sum_m 2^{-m(H+1)}\, \tilde X_0(2^{-m}\omega) $$

which, as one can readily verify, satisfies (5.6) ■
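
The octave-by-octave construction of X(w) can be made concrete with a short sketch. It builds X(w) from a generating function supported on pi < |w| <= 2 pi and checks the scaling relation X(2^k w) = 2^(-k(H+1)) X(w) that follows from the displayed sum; (5.6) itself is not reproduced here, so its exact form is an assumption. The real-valued generator `x0`, the value H = 0.4, and the function names are hypothetical, and a complex-valued generator would work the same way.

```python
import math

def octave_index(w):
    """Index m of the octave band 2^m pi < |w| <= 2^(m+1) pi that contains w (w != 0)."""
    return math.ceil(math.log2(abs(w) / math.pi)) - 1

def x_spectrum(w, H, x0):
    """X(w) = sum_m 2^(-m(H+1)) X0(2^(-m) w), with X0 supported on pi < |w| <= 2 pi.

    Only the term whose band contains w is non-zero, so no summation is needed.
    """
    m = octave_index(w)
    return 2.0 ** (-m * (H + 1.0)) * x0(w / 2.0 ** m)

def x0(w):
    """A hypothetical band-limited generator; any function on pi < |w| <= 2 pi would do."""
    return 2.0 + math.cos(w) if math.pi < abs(w) <= 2.0 * math.pi else 0.0

H = 0.4
w = 5.0
lhs = x_spectrum(4.0 * w, H, x0)                         # X(2^k w) with k = 2
rhs = 2.0 ** (-2 * (H + 1.0)) * x_spectrum(w, H, x0)     # 2^(-k(H+1)) X(w)
```

The single-term evaluation mirrors the remark in the proof that, for each frequency, only one term of the sum is non-zero.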


D.2  Proof  of  Theorem  5.3 

We begin by noting that the energy in the output $x_0(t)$ can be expressed as

$$ E = \int_{-\infty}^{\infty} x_0^2(t)\, dt = \frac{1}{\pi} \int_0^\infty |X_0(\omega)|^2\, d\omega
   = \sum_m \frac{1}{\pi} \int_{2^m\pi}^{2^{m+1}\pi} |X(\omega)|^2\, |\Psi(\omega)|^2\, d\omega $$

where $X_0(\omega)$ is the Fourier transform of $x_0(t)$, and where $X(\omega)$ is as defined in
Theorem 5.2. Exploiting the ripple relation (5.6), we get, through a change of variables,

$$ E = \frac{1}{\pi} \int_\pi^{2\pi} |X(\omega)|^2\, S(\omega)\, d\omega $$

where the non-negative spectrum $S(\omega)$ is defined as

$$ S(\omega) = \sum_m 2^{-\gamma m}\, |\Psi(2^m\omega)|^2. \tag{D.3} $$

Let us first show that $x_0(t)$ has finite energy if $x(t)$ is energy-dominated. Since

$$ E \le \left[ \sup_{\pi \le \omega \le 2\pi} S(\omega) \right] \frac{1}{\pi} \int_\pi^{2\pi} |X(\omega)|^2\, d\omega $$

it suffices to show that $S(\omega) < \infty$ on $\pi \le \omega \le 2\pi$. The vanishing moment condition
(5.17), together with the fact that $\Psi(\omega)$ is bounded, implies there exists a $C > 0$ such that

$$ |\Psi(\omega)| < C\, |\omega|^R $$

for all $\omega$. Using this with (2.8a) in (D.3) gives the desired bound

$$ S(\omega) \le \sum_{m=0}^{\infty} 2^{-\gamma m} + \sum_{m=1}^{\infty} 2^{\gamma m}\, C^2\, 2^{-2Rm}\, (2\pi)^{2R} < \infty $$

for $\omega \le 2\pi$ and $0 < \gamma < 2R$.

Let us now show the converse, i.e., that $x(t)$ is energy-dominated if $x_0(t)$ has finite
energy. Since

$$ E \ge \left[ \inf_{\pi \le \omega \le 2\pi} S(\omega) \right] \frac{1}{\pi} \int_\pi^{2\pi} |X(\omega)|^2\, d\omega $$

it suffices to show $S(\omega) > 0$ for every $\pi \le \omega \le 2\pi$, which we establish by contradiction.
Suppose for some $\pi \le \omega_0 \le 2\pi$,

$$ S(\omega_0) = \sum_m 2^{-\gamma m}\, |\Psi(2^m\omega_0)|^2 = 0. $$

Then since all the terms in the sum are non-negative, this would imply that each
term is zero, from which we could conclude

$$ \sum_m |\Psi(2^m\omega_0)|^2 = 0. $$

However, this contradicts the wavelet basis identity (2.9). Hence, we must have that
$S(\omega_0) > 0$ for every $\pi \le \omega_0 \le 2\pi$. The theorem follows ■
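
Both bounds on S(w) can be checked numerically for a specific wavelet. The sketch below evaluates a truncated version of (D.3) on a grid over pi <= w <= 2 pi and compares it with the closed form of the two geometric series in the upper bound. The Haar wavelet is an illustrative choice: it has R = 1, satisfies |Psi(w)| <= 1 and |Psi(w)| <= |w|/4 (so C = 1/4), and the bound then requires 0 < gamma < 2. None of these constants come from the thesis.

```python
import math

def haar_psi_abs(w):
    """|Psi(w)| for the Haar wavelet: sin^2(w/4) / |w/4| (zero at w = 0)."""
    x = abs(w) / 4.0
    return 0.0 if x == 0.0 else math.sin(x) ** 2 / x

def s_of_omega(w, gamma, M=60):
    """Truncated version of (D.3): sum over m = -M..M of 2^(-gamma m) |Psi(2^m w)|^2."""
    return sum(2.0 ** (-gamma * m) * haar_psi_abs(2.0 ** m * w) ** 2
               for m in range(-M, M + 1))

def upper_bound(gamma, R=1, C=0.25):
    """Closed form of the two geometric series in the bound, valid for 0 < gamma < 2R."""
    first = 1.0 / (1.0 - 2.0 ** (-gamma))                          # sum over m >= 0
    q = 2.0 ** (gamma - 2 * R)
    second = C ** 2 * (2.0 * math.pi) ** (2 * R) * q / (1.0 - q)   # sum over m >= 1
    return first + second

gamma = 1.0
grid = [math.pi * (1.0 + i / 100.0) for i in range(101)]   # pi <= w <= 2 pi
values = [s_of_omega(w, gamma) for w in grid]
```

On this grid every value is strictly positive and below the bound, consistent with the two halves of the argument.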

D.3  Proof  of  Theorem  5.6 

Following the approach of the proof of Theorem 5.2 in Appendix D.1, we begin by
noting that we may insert any ideal bandpass filter with a passband that includes
$\omega_L < |\omega| < \omega_U$ before the filter specified by (5.4) without changing the overall system.
We can construct such a filter as the finite parallel combination of filters with adjacent,
octave-width passbands. In particular, let $\tilde x_m(t)$ be the output of the filter with
frequency response (D.1) and corresponding impulse response $b_m(t)$, when driven by
$x(t)$. By definition of $x(t)$ we then know that $\tilde x_0(t)$ has finite power. Furthermore,
from (D.2) we get that every $\tilde x_m(t)$ has finite power.

Next, again choose finite integers $M_L$ and $M_U$ such that $2^{M_L}\pi \le \omega_L$ and
$\omega_U \le 2^{M_U+1}\pi$. Then

$$ \tilde b(t) = \sum_{m=M_L}^{M_U} b_m(t) $$

is the impulse response of an ideal bandpass filter with passband $2^{M_L}\pi < |\omega| \le 2^{M_U+1}\pi$.
When this filter is driven by $x(t)$ the output is

$$ z(t) = \sum_{m=M_L}^{M_U} \tilde x_m(t). $$

Because the $\tilde x_m(t)$ have power spectra supported on disjoint frequency intervals, their
cross-correlations are zero. Hence, $z(t)$ has power

$$ \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} z^2(t)\, dt = \sum_{m=M_L}^{M_U} \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde x_m^2(t)\, dt $$

which is finite since each of the $\tilde x_m(t)$ has finite power.

When we cascade the filter with impulse response $\tilde b(t)$ with the filter whose
frequency response is (5.4), the overall frequency response of the cascade is of course (5.4)
by our choice of $M_L$ and $M_U$. Hence, when the cascade is driven by $x(t)$, the
intermediate signal is $z(t)$ and the final output is necessarily $y(t)$ as defined in the statement
of the theorem. We can then conclude that $y(t)$ has power

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} S_y(\omega)\, d\omega = \frac{1}{\pi} \int_{\omega_L}^{\omega_U} S_z(\omega)\, d\omega
   \le \frac{1}{\pi} \int_0^\infty S_z(\omega)\, d\omega $$

which is finite because the last term on the right is the power in $z(t)$.

To show the spectrum relation (5.25), it suffices to note that

$$ S_y(\omega) = |B(\omega)|^2 \sum_m S_{\tilde x_m}(\omega) = \begin{cases} S_x(\omega) & \omega_L < |\omega| < \omega_U \\ 0 & \text{otherwise} \end{cases} $$

where

$$ S_x(\omega) = \sum_m S_{\tilde x_m}(\omega). $$

Again, since for each value of $\omega$ only one term is non-zero, the sum is convergent.
Finally, applying the self-similarity relation (D.2) gives

$$ S_x(\omega) = \sum_m 2^{-\gamma m}\, S_{\tilde x_0}(2^{-m}\omega) $$

which, as one can readily verify, satisfies (5.26) ■


D.4  Proof  of  Theorem  5.7 

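We first establish some notation. Let us denote the cross-correlation between two
finite-power signals $f(t)$ and $g(t)$ by

$$ R_{f,g}(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f(t)\, g(t - \tau)\, dt. $$

Its Fourier transform is the corresponding cross-spectrum $S_{f,g}(\omega)$. Similarly

$$ R_{a,b}[k] = \lim_{L\to\infty} \frac{1}{2L+1} \sum_{|n| \le L} a[n]\, b[n - k] $$

will denote the cross-correlation between two finite-power sequences $a[n]$ and $b[n]$.
We begin by expressing $x(t)$ as

$$ x(t) = \sum_m x_m(t) $$

where

$$ x_m(t) = \beta^{-m/2} \sum_n q[n]\, \psi_n^m(t). $$

Then the deterministic power spectrum of $x(t)$ is given by

$$ S_x(\omega) = \sum_m \sum_{m'} S_{x_m,x_{m'}}(\omega). \tag{D.4} $$

We shall proceed to evaluate these various terms. Because of the dilational
relationships among the $x_m(t)$, viz.,

$$ x_m(t) = \beta^{-m/2}\, 2^{m/2}\, x_0(2^m t), $$

it shall suffice to consider a single term of the form $S_{x_0,x_m}(\omega)$, for some $m \ge 0$.
Hence, let

$$ v_m(t) = \sum_n q[n]\, \delta(t - 2^{-m} n) $$

and note that

$$ v_0(t) = \sum_n \tilde q[n]\, \delta(t - 2^{-m} n) $$

where $\tilde q[n]$ is an upsampled version of $q[n]$, i.e.,

$$ \tilde q[n] = \begin{cases} q[2^{-m} n] & n = 2^m l,\ l = \ldots, -1, 0, 1, 2, \ldots \\ 0 & \text{otherwise.} \end{cases} $$

Hence,

$$ S_{v_0,v_m}(\omega) = 2^m \sum_k R_{\tilde q,q}[k]\, e^{-j\omega 2^{-m} k} $$

where

$$ R_{\tilde q,q}[k] = \lim_{L\to\infty} \frac{1}{2L+1} \sum_{|n| \le L,\ n = 2^m l} q[2^{-m} n]\, q[n - k]
   = \lim_{L\to\infty} \frac{1}{2L+1} \sum_{|l| \le 2^{-m} L} q[l]\, q[2^m l - k]. $$

Since $q[n]$ is correlation-ergodic, we may replace this correlation with its expected
value:

$$ R_{\tilde q,q}[k] = \lim_{L\to\infty} \frac{1}{2L+1} \sum_{|l| \le 2^{-m} L} \delta[(2^m - 1)\, l - k]
   = \begin{cases} \delta[k] & m = 0 \\ 0 & \text{otherwise.} \end{cases} $$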

Hence,

$$ S_{v_0,v_m}(\omega) = \begin{cases} 1 & m = 0 \\ 0 & \text{otherwise} \end{cases} $$

where, without loss of generality, we have set $\sigma^2 = 1$. Then, using


we get that

$$ S_{x_0,x_m}(\omega) = \begin{cases} |\Psi(\omega)|^2 & m = 0 \\ 0 & \text{otherwise.} \end{cases} \tag{D.5} $$

Finally, we note that


and that


Using these identities together with (D.5) in (D.4) yields

$$ S_x(\omega) = \sum_m \beta^{-m}\, |\Psi(2^{-m}\omega)|^2 $$

as desired ■